Sunday, February 15, 2009

PHP comments from PHP!

The most interesting article in the December 2008 edition of php|architect was on phpdoc. I'm a fanatic when it comes to the javadoc style of commenting. When I create a new function the first thing I do is write function somename(){}, then the second thing I do is write /** */ above it.

If I'm refactoring then I might copy and paste in some code straightaway, but usually the third thing I do is document. I describe what the function does, er, will do, and usually define the @return tag (if I've not worked out what data, if any, it should return then I've got ahead of myself - how can I even name a function if I don't know what it produces??). The @param tags I usually leave until after I've written and tested the function, as they can change a lot (i.e. I need an algorithm before I know what inputs my algorithm takes).

During the writing of the function body I will usually jump up to the comment block and add an @internal comment (describing the implementation), or a @todo comment ("Need to add error-checking on call to somefunc()" is a very common one!).

To quote from the php|a article: "What's really exciting, though, is that since PHP 5.1, the Reflection classes can also supply the subject's 'doc comment'."

See http://php.net/reflection
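Here is a minimal sketch of what that looks like in practice. The function add_numbers() and its docblock are made up for illustration; ReflectionFunction::getDocComment() is the real call, and it hands back the raw docblock as a string (or false if there isn't one). The regex for pulling out tags is just my quick guess at something workable, not a proper phpdoc parser.

```php
<?php
/**
 * Adds two numbers.
 *
 * @param int $a First operand (illustrative).
 * @param int $b Second operand (illustrative).
 * @return int The sum.
 */
function add_numbers($a, $b)
{
    return $a + $b;
}

// Ask the Reflection API for the docblock sitting above add_numbers().
$reflector = new ReflectionFunction('add_numbers');
$doc = $reflector->getDocComment();

// Collect every "@tag rest-of-line" entry with a simple multi-line regex.
preg_match_all('/^\s*\*\s*@(\w+)\s+(.*)$/m', $doc, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[1] is the tag name, $m[2] is whatever follows it on that line.
    echo $m[1], ': ', trim($m[2]), "\n";
}
```

One thing to watch: if an opcode cache is configured to strip comments, getDocComment() will return false, so real code should check for that.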

The php|a article introduced a base class for automatically creating setters for class vars. My own interest is in somehow using it for contract-based programming (also called DBC: Design By Contract).

I believe the Java implementation of DBC uses docblock entries, but with a pre-compiler. Pre-compilers suck: being able to edit and run the same source code file is a big bonus, and maybe reflection of docblocks enables that?

Of course, using comments is a poor cousin of DBC built into the language itself (especially if it can do compile-time analysis). But, for some reason I've not yet grasped, language designers just don't understand how important it is. They limp along with lint and asserts instead.

Back to the docblocks, I've not thought out the details; does PHP provide some hook that can be called when each function is entered and exited? If not I guess you'd have to write VALIDATE() as the first line of each function, and a RETURN() function everywhere you use return, which would be rather intrusive.

Either way, it could then look for @paramvalidate tags describing what counts as valid data for each parameter. It could also look for @invariant tags in the class doc block, and check that the object is valid on both entry and exit.
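To make the intrusive version concrete, here is a rough sketch of the VALIDATE() idea for plain functions. Everything about the tag is my own assumption: I've invented the syntax "@paramvalidate $name min..max" and only handle integer ranges. The function scale_percent() is a made-up example, and the RETURN() half and the class-level checks are left out.

```php
<?php
/**
 * Hypothetical helper: checks the caller's arguments against the
 * "@paramvalidate" tags found in the caller's own docblock.
 */
function VALIDATE($function, array $args)
{
    $ref = new ReflectionFunction($function);
    $doc = $ref->getDocComment();
    if ($doc === false) {
        return; // No docblock means no contract to check.
    }

    // Map each parameter name to its position in the argument list.
    $positions = array();
    foreach ($ref->getParameters() as $param) {
        $positions[$param->getName()] = $param->getPosition();
    }

    // Assumed tag syntax: @paramvalidate $name min..max
    preg_match_all('/@paramvalidate\s+\$(\w+)\s+(-?\d+)\.\.(-?\d+)/', $doc, $tags, PREG_SET_ORDER);

    foreach ($tags as $tag) {
        list(, $name, $min, $max) = $tag;
        if (!isset($positions[$name]) || !array_key_exists($positions[$name], $args)) {
            continue; // Tag names an unknown or omitted parameter; skip it.
        }
        $value = $args[$positions[$name]];
        if ($value < (int) $min || $value > (int) $max) {
            throw new InvalidArgumentException(
                "$function(): \$$name must be in $min..$max, got $value"
            );
        }
    }
}

/**
 * Scales a percentage by a small factor.
 *
 * @paramvalidate $percent 0..100
 * @paramvalidate $factor 1..10
 */
function scale_percent($percent, $factor)
{
    VALIDATE(__FUNCTION__, func_get_args());
    return $percent * $factor;
}
```

Calling scale_percent(50, 2) passes the check, while scale_percent(150, 2) throws before the body runs. Parsing the docblock on every call is obviously slow, so a real version would want to cache the parsed tags per function.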

Did I mention that DBC finds loads of bugs without you having to write a single unit test? All you have to do is document properly. Which, of course, you are going to do anyway if you are a professional programmer.

But did you realize DBC can also be used for optimization, by providing hints to the optimizer? As a concrete example (I'm thinking about C++ here), say you have two input variables, both described as being in the range 1..100. When they are multiplied together the range of the result is 1..10000. Therefore there is no need to consider overflow (assuming an integer type of at least 16 bits). And if you then divide by that result, a divide-by-zero error is impossible. I'm not a compiler writer, and haven't written assembly in about 15 years, but I'm sure the above information, or similar, can help.
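The range arithmetic itself is easy to sketch. This is only an illustration of the reasoning, written in PHP to match the rest of this post, and not a claim about how any real optimizer works. The helper names multiply_ranges() and range_excludes_zero() are my own inventions, and ranges are just two-element arrays of (min, max).

```php
<?php
// Hypothetical helper: the range of x * y, given inclusive ranges for x and y.
// Taking min/max over the four corner products also copes with negative bounds.
function multiply_ranges(array $x, array $y)
{
    $corners = array(
        $x[0] * $y[0],
        $x[0] * $y[1],
        $x[1] * $y[0],
        $x[1] * $y[1],
    );
    return array(min($corners), max($corners));
}

// Hypothetical helper: true if zero cannot occur anywhere in the range,
// which is what makes a later division provably safe.
function range_excludes_zero(array $range)
{
    return $range[0] > 0 || $range[1] < 0;
}

$product = multiply_ranges(array(1, 100), array(1, 100));

// 32767 is the largest value of a signed 16-bit integer.
$fits_in_16_bits = $product[1] <= 32767;
$safe_to_divide  = range_excludes_zero($product);
```

Here $product comes out as 1..10000, which fits in 16 bits and never touches zero, so both the overflow check and the divide-by-zero check could be dropped.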

As another example, one that also applies to PHP, static analysis of the possible values can tell you that in an if/elseif/elseif/else block only the second elseif branch can ever be reached. The compiler can then both report this as a warning and throw away all the dead conditional code.
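A made-up example of that situation: classify() below promises (via my hypothetical @paramvalidate tag) that $n is in 1..100, which makes three of its four branches dead. PHP has no static analyser built in that reads such a tag, so the loop at the end is just a brute-force stand-in that tries every value in the declared range and records which branches actually fire.

```php
<?php
/**
 * Classifies a number that is promised to be in range.
 *
 * @paramvalidate $n 1..100
 */
function classify($n)
{
    if ($n < 1) {
        return 'too small';   // unreachable when 1 <= $n
    } elseif ($n > 100) {
        return 'too large';   // unreachable when $n <= 100
    } elseif ($n <= 100) {
        return 'in range';    // the only branch that can run
    } else {
        return 'unexpected';  // unreachable: the branch above catches everything
    }
}

// Brute-force stand-in for static analysis: try every value in the contract.
$reached = array();
for ($n = 1; $n <= 100; $n++) {
    $reached[classify($n)] = true;
}
$branches = array_keys($reached);
```

Only 'in range' ends up in $branches, so given the contract the whole function could be reduced to a single return statement.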

But that is all a pipe dream: it requires DBC to be implemented in the language itself. And I've drifted from the point of this article, which was: a PHP script can read its own comments! Cool!