Tuesday, June 24, 2008

Boost Spirit, shadowing, and a trailing newline

I've never been a fan of Boost::Spirit. It adds a lot to compile time (in fact, some code failed to compile on a machine with only 512MB of memory, solely because of Spirit), and its source code has shadowed variables (which cause warnings if you switch all warnings on, and therefore build failures if you compile with -Werror).

I sent in a patch against 1.33 for the shadowed variable problems. So you can imagine my disappointment when I tried to compile against boost 1.34.1 (the version that comes with Ubuntu 7): the problems were still there, and I had to hack the source code again.
But that was nothing compared to how I felt when my unit tests failed to run. No compile errors, they simply failed to parse. My code was unchanged; only the boost version had changed.

I have finally tracked this down: if the input has a trailing newline, it won't fully parse. I thought the space_p parameter to parse() would take care of that, and perhaps that is the behaviour that changed between 1.33.1 and 1.34.1?

But most frustrating of all is that none of these, added to the end of the grammar, fixes it:
'\n'
"\n"
ch_p('\n')
*ch_p('\n')
+ch_p('\n')
str_p("\n")
*str_p("\n")
+str_p("\n")

Actually I'm out of ideas!

Here is a code snippet:
GameTree = ch_p('(')[bind(&SGFMoveList::on_game_start, this, _1, fname)]
    >> RootNode[bind(&SGFMoveList::on_root_node_end, this, _1, _2)]
    >> *Node[bind(&SGFMoveList::on_node_end, this, _1, _2)]
    >> *VariationTree
    >> !ch_p(';')
    >> ch_p(')')[bind(&SGFMoveList::on_game_end, this, _1)]
    >> *ch_p('\n');

parse_info<const char*> info = parse(str, *GameTree, space_p);


(I've left a ch_p('\n') in there at the end, just in case it is doing some good.)

So, my solution is to check info.hit instead of info.full! hit is true if the parser matched; full is true only if it matched and the whole input string was consumed. In my case the trailing newline is not getting matched, so hit is true and full is false.

That is a bit crude and may be hiding a genuine problem. So I then look at info.stop (a pointer to the first character of the input string that didn't get matched) to make sure that everything from there to the end is whitespace.
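
A minimal sketch of that check, in plain C++ so it does not depend on the Spirit version. The helper name only_whitespace_remains is my own invention for illustration (it is not part of Spirit), and it assumes the input is a NUL-terminated string, as in the parse_info<const char*> case above.

```cpp
#include <cctype>

// Hypothetical helper (not part of Spirit): returns true when every
// character from `stop` up to the terminating NUL is whitespace.
// An empty remainder also counts as "only whitespace".
bool only_whitespace_remains(const char* stop)
{
    for (; *stop != '\0'; ++stop) {
        // Cast to unsigned char first: passing a negative char value
        // to std::isspace is undefined behaviour.
        if (!std::isspace(static_cast<unsigned char>(*stop)))
            return false;
    }
    return true;
}
```

With this, the test after parsing would be something like: info.hit && only_whitespace_remains(info.stop). Anything other than whitespace left over is then still reported as a failure.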

If anyone knows what is going on please let me know. It amazes me that no-one else has noticed this regression in Spirit, so surely it must be something I'm doing wrong?

P.S. On the next project where I needed a parser I tried Hapy (http://hapy.sourceforge.net/) instead of Spirit. It was much faster and easier to use. Its downside is that practically no-one uses it, so there is not much documentation, community, etc.
On most projects since then I've used Boost::tokenizer wherever possible, going out of my way to avoid writing a real parser.
The above project, where I'm using Spirit, is complex, needs a real parser, and it works, hence my reluctance to port it to Hapy, or look for something else.
