Capturing absolute offsets for JavaCC/JJTree tokens

I use JavaCC to generate parsers in Java, and JJTree to build the AST after parsing. JJTree creates the nodes of the AST, and you can configure JavaCC options to capture tokens in each node, i.e. so that each node contains its start and end tokens. The Token class that JavaCC generates by default records positions relative to the start of the line. It has fields like beginLine, beginColumn, endLine and endColumn. The line numbers are absolute (starting from 1), and the column fields contain offsets (again starting from 1) within the corresponding lines.
However, you often want the absolute offset of a token in the input stream, not just its offset within the line. I wish there was a JavaCC option to enable this, but it is not too complex to do yourself.
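
As a rough illustration of how the line and column values relate to an absolute offset, here is a minimal sketch that converts a 1-based (line, column) pair into a 0-based offset when the whole input is available as a String. This is not the approach described in this post and it is not generated by JavaCC: the class LineOffsetIndex and its method toOffset are hypothetical names. It also assumes that every character advances the column by exactly one, so it would need adjusting if the generated character stream expands tabs to tab stops.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of JavaCC): maps 1-based line/column
// pairs, like Token.beginLine and Token.beginColumn, to 0-based offsets.
final class LineOffsetIndex {
    // lineStarts.get(i) is the absolute offset of the first character of line i + 1.
    private final List<Integer> lineStarts = new ArrayList<>();

    LineOffsetIndex(String input) {
        lineStarts.add(0); // line 1 always starts at offset 0
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\n') {
                lineStarts.add(i + 1);
            } else if (c == '\r') {
                // Treat "\r\n" as a single line terminator.
                if (i + 1 < input.length() && input.charAt(i + 1) == '\n') {
                    i++;
                }
                lineStarts.add(i + 1);
            }
        }
    }

    // Assumption: one column per character (no tab expansion).
    int toOffset(int line, int column) {
        return lineStarts.get(line - 1) + (column - 1);
    }
}
```

With a token t, the call would look like index.toOffset(t.beginLine, t.beginColumn). Recording the offset while the input is being read avoids the tab and line-terminator assumptions that this kind of after-the-fact conversion depends on.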

To explain how to do this, I will take the grammar file that is generated by the JavaCC wizard of the JavaCC Eclipse plugin. This is the default grammar file it generates – Continue reading “Capturing absolute offsets for JavaCC/JJTree tokens”

Handling some of the warnings and errors generated by JavaCC

I am currently building a parser using JavaCC. I have used JavaCC in the past, but whenever I use it after a long gap, I have to relearn a few things about it – particularly handling warnings. So I thought this time I would blog about ways to handle some of the frequent warnings that I have seen.

If you are unfamiliar with JavaCC, it is a parser generator. You write a grammar in EBNF (Extended Backus-Naur Form) and feed it to JavaCC, which then generates the Java classes for the parser. I do not want to turn this post into a JavaCC tutorial. There are some very good tutorials available on the JavaCC Documentation page and in the FAQ. I find the Lookahead MiniTutorial and the Token Manager MiniTutorial especially useful. If you use the Eclipse IDE, you will find the JavaCC plugin for Eclipse useful: it provides a wizard to create JavaCC or JJTree files (JJTree creates an AST, Abstract Syntax Tree, after parsing the input), along with code colorization, an outline, code hyperlinks, syntax checking and compilation. You can also set JavaCC debug options easily using this plugin.

I will use the following tokens, which are generated by default when you use the wizard provided by the JavaCC Eclipse plugin to create a JavaCC grammar file. I have created a .jjt file for the examples in this blog.

Continue reading “Handling some of the warnings and errors generated by JavaCC”