Moving away from flex to a handwritten lexer

Written by Walter on 1/1/2015


Sometimes people hold on to standards and forget that computer science evolves constantly.
Way back, I wrote an example interpreter called wsbasic and tried to make it as simple and readable as I possibly could, so I wrote its lexer by hand, without flex, just to keep things simple.
For MPL, around 2003, I started tuning our flex-based lexer class, which was based on that (the reason being that the environment needed to be ported to multiple OSes such as Windows, Linux, OS X, etc.). Yesterday, while surfing around, I found on the GNU site that they've done the same. That gives me a lot of hope that moving away from flex+bison was a good path to follow.

Under "New Languages and Language specific improvements", their advice for C and Objective-C says: "The old Bison-based C and Objective-C parser has been replaced by a new, faster hand-written recursive-descent parser."

Found the link here: Advice 2

Just food for thought. I have to look through some of my backed-up source code from 2000-2005 and run another benchmark to know for sure. Of course, you won't have full regular expressions with this wsbasic-style lexer class, but from experience you don't need them most of the time. (You could also do full regexps with the Boost libs if you need a full regular expression engine, but that would likely end up being slower again than a flex-generated regexp parser.)
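To make the idea concrete, here is a minimal sketch of what a handwritten lexer without regular expressions can look like. It is not the wsbasic or MPL code: it is written in Python for brevity, and the `Token` type, the `tokenize` function, the token kinds and the operator set are all made up for illustration.

```python
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # "NUMBER", "IDENT", "OP" or "EOF" (hypothetical token kinds)
    text: str  # the exact characters that make up the token
    pos: int   # offset of the token's first character in the source


# Assumed single-character operator set, chosen only for this example.
OPERATORS = set("+-*/=()<>")


def tokenize(source: str) -> Iterator[Token]:
    """Scan `source` one character at a time, without regular expressions."""
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch.isspace():
            i += 1  # skip whitespace between tokens
            continue
        start = i
        if ch.isdigit():
            # Consume a run of digits as one integer literal.
            while i < n and source[i].isdigit():
                i += 1
            yield Token("NUMBER", source[start:i], start)
        elif ch.isalpha() or ch == "_":
            # Identifiers: a letter or underscore, then letters, digits, underscores.
            while i < n and (source[i].isalnum() or source[i] == "_"):
                i += 1
            yield Token("IDENT", source[start:i], start)
        elif ch in OPERATORS:
            i += 1
            yield Token("OP", ch, start)
        else:
            raise SyntaxError(f"unexpected character {ch!r} at offset {start}")
    # Always finish with an end-of-input marker for the parser to stop on.
    yield Token("EOF", "", n)
```

Calling `list(tokenize("x = 10 + y2"))` yields an identifier, an operator, a number, an operator, another identifier and a final `EOF` token. Each token class is just a small loop over characters, which is why this style tends to stay readable and has nothing platform-specific in it.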

At Emory University I gave a talk about MPL around 2005, and someone in the audience asked the same question: why didn't I use flex for lexing? My answer was mostly that we needed to be cross-platform, but in our case it also turned out to be quite clean and maintainable code.

Hope I can explore a lot more in 2015 and maybe even more magic will happen ;).

