For a while now we've been hosting what I consider to be quite the hidden gem in the infrastructure of the 4Suite project: BisonGen.
BisonGen is a Python tool that reads an input file in a simple XML format based on Bison's text format and generates LALR parsers both in pure Python and as a Python/C extension. Each grammar therefore yields a fast parser and a more portable one, for maximum flexibility.
In this article I provide information on BisonGen, in preparation for more complete packaging to come later (probably by the next 4Suite release).
The latest version of BisonGen can always be downloaded from the FTP site. The most recent release was 0.8.0b1, in mid-April. See Jeremy's announcement.
See the simple, built-in example to get a picture of what BisonGen expects for input. More sophisticated examples are in 4Suite: in Ft/Xml/XPath, Ft/Xml/XSLT and Ft/Rdf/Parsers/Versa.
Martin v. Löwis presented "Towards a Standard Parser Generator" at IPC10, and his overview of BisonGen is very useful. He noted the big performance advantage of BisonGen parsers over pure Python counterparts (assuming, of course, that you use the C parser that BisonGen generates).
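Since BisonGen emits both a C and a pure Python parser, a common way to take advantage of the pair is to try the fast one first and fall back to the portable one. The following is only a minimal sketch of that general import pattern, not BisonGen's own API: the helper `load_first_available` and the module names `myparser_c` and `myparser_py` are hypothetical and stand in for whatever modules your build actually produces.

```python
import importlib


def load_first_available(module_names):
    """Return the first module in module_names that imports successfully.

    Modules are tried in order, so list the fast C extension first and
    the portable pure Python module last.
    """
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # This candidate is not installed or not built; try the next one.
            continue
    raise ImportError(
        "none of these modules could be imported: %r" % (list(module_names),)
    )


# Hypothetical module names, used here for illustration only:
# a generated C extension first, then the generated pure Python parser.
PARSER_CANDIDATES = ["myparser_c", "myparser_py"]
```

Calling `load_first_available(PARSER_CANDIDATES)` would then hand back whichever parser module is importable on the current platform, so the rest of the application does not need to know which one it got.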
Some other useful resources on BisonGen:
- BisonGen page on moinmoin.wikiwikiweb.de
- BisonGen parser notes by Gabriel Wicke
- Ned Batchelder's "Python Parsing Tools", a comparative discussion of the whole class of tools