Tuesday, March 23, 2010

The Dangers Of GroupBy

I had a complex join (three tables) with a group by and a where cause with three conditions. It seemed to be working, until I added more data; I eventually narrowed problem down to just a single table. Consider this query:

SELECT w.user_id, w.some_date, w.id
FROM mytable w
GROUP BY w.user_id


id is the primary key. That query returns all the data:

user_id some_date id
1 2010-02-01 1
5 2010-02-02 2
1 NULL 3
1 2010-03-22 4
1 2010-03-24 5


I'm only interested in the record with the latest date for each user, so I add a MAX on some_date and GROUP BY user_id:

SELECT w.user_id, MAX(w.some_date), w.id
FROM mytable w
GROUP BY w.user_id


That gives:

user_id MAX( w.some_date ) id
1 2010-03-24 1
5 2010-02-02 2


Hang on! The dates are right, but 2010-03-24 is from id=5, not id=1. The some_date column has been max-ed in isolation.


This issue is explained in an excellent two-part article: part1 part2

In fact it is so excellent I didn't fully understand it, and am going to re-read it three times a day until I do. I really need to understand it because, as the author shows, it is very easy to get correct answers from a query on your test data, pass all your unit tests, and deliver something that goes wrong when one more record is added.

My query is simpler than the one in his example, but after many tries I cannot find a solution. The "join(...)as d" syntax does not seem to work in MySQL or I am not using it correctly. It seems what I'm trying to do is very basic, so I do not understand why I cannot find more advice on the subject. I'm open to suggestions!

(For the moment I'm going to give up, do a simpler query and have PHP process the results to get the data I actually want.)

Sunday, March 7, 2010

Image Protection in Zend Framework

Sometimes you want to only show certain images (or other media) to certain users (e.g. those that are logged in, or those that have paid for it). This quick tutorial will show how to write the controller for Zend Framework for serving images. This is for ZF 1.10, but as far as I can tell it is not using anything new or unusual.

This tutorial does not cover the validation part of the code.

We start with a controller that does not use views, layout and all that stuff:
class ImgController extends Zend_Controller_Action
{
public function init()
{
$this->_helper->viewRenderer->setNoRender(true);
$this->_helper->layout()->disableLayout();
}
}


Now add an action that will serve all images:
public function imgAction(){
$type=$this->getRequest()->getParam(1);
$fname=$this->getRequest()->getParam(2);
$ext=$this->getRequest()->getParam(3);

echo "type=$type, fname=$fname, ext=$ext";
}

public function badAction(){
$this->getResponse()->setHeader('Content-Type','image/jpeg');
}


The idea is that URLs such as http://127.0.0.1/img/folder/abc.jpg will end up at the imgAction() function, and $type will get set to "folder", $fname to "abc" and $ext to "jpg". To do that we need to set up what is called a router, which we will do next. The badAction() will handle any problems; this crude code will send back a 0 byte jpeg.

To create a router, jump to your bootstrap file and create a function something like this:
public function _initCustomRouting(){
$frontController=Zend_Controller_Front::getInstance();
$router=$frontController->getRouter();
$router->addRoute(  //To catch any that are not formatted correctly
'imgHandlingBad',
new Zend_Controller_Router_Route('img/*',
array('controller'=>'img', 'action'=>'bad')
)
);
$router->addRoute(  //Formatted as /url/type/fname.ext
'imgHandling',
new Zend_Controller_Router_Route_Regex('img/(.+)/(.+)\.(.+)',
array('controller'=>'img', 'action'=>'img')
)
);
}


Note that order: imgHandlingBad must come before imgHandling. I found this introduction (7 minute screencast) to using Zend_Controller_Router_Route very useful; then some trial and error and study of the ZF source code.

Now you should be able to test http://127.0.0.1/img/folder/abc.jpg and see our debug comment. But without a folder (e.g. http://127.0.0.1/img/abc.jpg) it will get handled by the badAction(), as will those without an extension: http://127.0.0.1/img/folder/abcjpg

Now the final step is to feed back some images, so in imgAction() replace the echo line with this code:
$path=$basePath.$type;
$fullPath=$path.$fname.'.'.$ext;
if(!file_exists($fullPath))return $this->badAction();

switch($ext){
case 'jpg':$mime='image/jpeg';break;
case 'png':$mime='image/png';break;
case 'gif':$mime='image/gif';break;
default:return $this->badAction();
}
$this->getResponse()->setHeader('Content-Type',$mime);

readfile($fullPath);


It decides the filename, decides the mime-type based on extension, and then readfile passes on the binary data.

In the above code we use the $type directly to decide the image path. You could instead map type to different directories. E.g.
switch($type){
case 'www':$path='/var/www/main_images/';break;
case 'www/articles':$path='/var/www/html/special/articles/';break;
case 'family':$path='/home/user/images/family/';break;
default:return $this->badAction();
}


Another way you might use $type is if certain users can only see certain types of images. You could do different validation checks in each case statement above.

Got any suggestions to make this code better? Let me know!

P.S. One special note about this technique. In the typical MVC Zend Framework setup, if you create public/img/ then apache will serve images from there; only if the image is missing in that location will Apache ask the Zend Framework to serve it. You might use this to your advantage to speed up delivery of certain common images. But it also opens the way to accidentally serving images that are supposed to be protected.

Tuesday, March 2, 2010

Creating A Doctrine Custom Behaviour, Part 2

In part 1 we made a simple custom behaviour and then gave it some options we could customize in our schema. The second half of the power of Doctrine's behaviours though is being able to set values. First, uncomment this line in the last example in part 1:

$this->addListener(new DarrenTestableListener($this->_options));


You should get told class DarrenTestableListener does not exist, so create DarrenTestableListener.php in the same directory as DarrenTestable.php (though, as mentioned before, it can go anywhere in your models directory tree), and fill it with this code:

class DarrenTestableListener extends Doctrine_Record_Listener
{
protected $_options = array();

public function __construct(array $options)
{
$this->_options = $options;
}

/** Called when a new record is created */
public function preInsert(Doctrine_Event $event)
{
$name = $event->getInvoker()->getTable()->getFieldName($this->_options['name']);
$modified = $event->getInvoker()->getModified();
if ( ! isset($modified[$name])) {
$event->getInvoker()->$name = "C";
}
}

/** Called when an existing record is updated */
public function preUpdate(Doctrine_Event $event)
{
$name = $event->getInvoker()->getTable()->getFieldName($this->_options['name']);
$modified = $event->getInvoker()->getModified();
if ( ! isset($modified[$name])) {
$event->getInvoker()->$name .= 'U';
}
}

/** Handle dql update queries */
public function preDqlUpdate(Doctrine_Event $event)
{
$params = $event->getParams();
$name = $event->getInvoker()->getTable()->getFieldName($this->_options['name']);
$field = $params['alias'] . '.' . $name;
$query = $event->getQuery();
if ( ! $query->contains($field)) {
$query->set($field, '?', 'U');
}
}

}


It has default options (and a constructor to set the options), then one function to handle INSERTs and two functions to handle UPDATEs. (There are also hooks for DELETE, save and validation.) All three functions follow a similar pattern:

  1. Get the real fieldname
  2. See if we need to do anything
  3. If so, set the field


What we do is set the field to "C" when it is created. Then each time it is updated we append a "U". E.g. after three updates it will look like "CUUU". (TODO: I'm not so familiar with DQL, and I am not using it, so the DQL version replaces the existing data instead of appending. If you can supply the proper code for it let me know!)

Incidentally if you set a default in the options that will get precedence over your code (that is what the getModified() call is for, I assume). E.g. if I change my YAML schema to:

ActsAs:
DarrenTestable:
name: being_silly
options:
default: Hello!


Then after a couple of updates being_silly will contains "Hello!UU".

Creating A Doctrine Custom Behaviour, Part 1

Doctrine is a wonderful ORM library, but the manual is..., well, let's just say I think it was written by the developers, and out of obligation :-)

Behaviours are a motivating feature for using Doctrine, so here is a two-part tutorial on how to add your own minimal behaviour. The first part will show how to add a field to a table. The second part will show how to have its contents set automatically. For a more in-depth example see here or study the source code of the core behaviours.

The first thing you need to know is to put your behaviour classes in your models directory, or a sub-directory of your choice. You don't need a directory structure that matches the "XXX_YYY_" prefixes of the class name. This is assuming you are using Doctrine's auto-loading of table classes; if not you can put them anywhere you like of course. I'm putting them in models/behaviors/ (yes, using American spelling!)

Make a file called "DarrenTestable.php" with these contents:

class DarrenTestable extends Doctrine_Template{
public function setTableDefinition(){
$this->hasColumn('darren_testing','string');
}
}


See, I told you it was minimal! To add it to an existing class, add this to your YAML schema:


ActsAs:
DarrenTestable


Recreate the class from the schema and you should see a field called "darren_testing". On MySQL the type is "text" (i.e. this is what "string" maps to if you are using MySQL).

Even that very small example is doing something useful: if this behaviour is used in 10 classes and we need to change the name or data type (or set a default, add an index, etc.) we only need to edit it one place.

While still staying within the scope of a minimal example, we can expand it in two directions. We can have our code automatically set the field value (see part 2 of this article), and we can allow users to customize it. Here is how we add some options, with defaults:


class DarrenTestable extends Doctrine_Template
{
protected $_options = array(
'name'=>'darren_testing',
'type'=>'string',
'options'=>array(),
);

public function setTableDefinition()
{
$this->hasColumn($this->_options['name'],
$this->_options['type'], null, $this->_options['options']);
//$this->addListener(new DarrenTestableListener($this->_options));
}
}


I have added the call to the listener we'll use in part 2, but commented it out for the moment. The options contains a sub-array called options which is passed on to Doctrine_Table::setColumn() and can be used to set defaults and so on.

Usage is exactly like above if we want the defaults. If we want to change the name, and require it to be set (i.e. NOT NULL) you would do:

ActsAs:
DarrenTestable:
name: being_silly
options:
notnull: true