Parsing with Rhino is quite simple and well documented.
The AstRoot class inherits methods like getSymbols and getSymbolTable, which I presumed would give me all variables and functions declared in the code. However, these methods returned only the functions and not the declared variables. So I had to traverse the tree and collect all symbols myself. I thought I would use the visit method of AstRoot to process all nodes.
However, my visitor code threw a ClassCastException in the visit method of the ScriptNode class.
The test code was a simple JS script: a function named test and a top-level statement assigning a number to a variable.
So finally I decided to traverse the AST from the root and process each node myself. That is when I realized that Rhino builds its AST differently from the ASTs I had worked with earlier. Those created each node with its children as an array of nodes. If you start with the root and visit each child, you are sure to traverse the entire tree and hence the entire code.
However, Rhino does not create nodes with an array of child nodes. Each node has a method to access its first child, and you reach the remaining children by following the ‘next’ member of each child. So instead of an array it creates a linked list of child nodes, which is not such a big problem. However, a few things surprised me when I traversed the tree.
I started with the root node, and for each node I followed these rules:
- Get the first child and traverse the linked list by calling next until next is null
- Call next on the parent node and repeat the loop.
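To make the two rules concrete, here is a minimal sketch of a first-child/next-sibling walk. It does not use Rhino at all: SimpleNode, its fields and the sample tree are made up for illustration, and they only mimic the shape Rhino exposes through getFirstChild() and getNext() on its Node class.

```java
import java.util.ArrayList;
import java.util.List;

// SimpleNode is a hypothetical stand-in, not a Rhino class.
class SiblingWalk {
    static final class SimpleNode {
        final String label;
        SimpleNode first; // first child, or null for a leaf
        SimpleNode next;  // next sibling, or null at the end of the chain

        SimpleNode(String label) { this.label = label; }

        // Appends a child to the end of this node's sibling chain.
        SimpleNode add(SimpleNode child) {
            if (first == null) {
                first = child;
            } else {
                SimpleNode last = first;
                while (last.next != null) last = last.next;
                last.next = child;
            }
            return this;
        }
    }

    // Pre-order walk: record the node, then follow first/next through its children.
    static void walk(SimpleNode node, List<String> out) {
        out.add(node.label);
        for (SimpleNode child = node.first; child != null; child = child.next) {
            walk(child, out);
        }
    }

    static List<String> labels(SimpleNode root) {
        List<String> out = new ArrayList<>();
        walk(root, out);
        return out;
    }

    // A small made-up tree: root has children a and b; b has children b1 and b2.
    static SimpleNode sample() {
        SimpleNode b = new SimpleNode("b")
                .add(new SimpleNode("b1"))
                .add(new SimpleNode("b2"));
        return new SimpleNode("root").add(new SimpleNode("a")).add(b);
    }

    public static void main(String[] args) {
        System.out.println(labels(sample())); // [root, a, b, b1, b2]
    }
}
```

The recursion takes care of "call next on the parent": when the loop over one node's children ends, control returns to the loop that is iterating over the parent's siblings.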
Here is what I found:
- When you traverse as above, you only visit the function name, not the function body.
- If you want to visit the function body, you have to call the getFunctions (or getFunctionNode) method of ScriptNode. Both AstRoot and FunctionNode are of type ScriptNode.
- If you traverse the nodes as I described above (children first and then the next node), you are not guaranteed to visit all AST nodes.
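The first two findings can be pictured with a small model in which function bodies sit in a side list instead of on the child chain. This is only a sketch of the behaviour I observed, not Rhino code: N, Script, the functions field and the "body" label are all hypothetical names, with Script playing the role that ScriptNode plays in Rhino.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not Rhino classes.
class FunctionSideList {
    static class N {
        final String label;
        N first, next; // first child and next sibling
        N(String label) { this.label = label; }

        // Appends a child to the end of this node's sibling chain.
        N add(N child) {
            if (first == null) {
                first = child;
            } else {
                N last = first;
                while (last.next != null) last = last.next;
                last.next = child;
            }
            return this;
        }
    }

    // A script-like node that keeps its function bodies outside the child chain.
    static final class Script extends N {
        final List<Script> functions = new ArrayList<>();
        Script(String label) { super(label); }
    }

    // Follows only the first/next links.
    static void walkChain(N node, List<String> out) {
        out.add(node.label);
        for (N c = node.first; c != null; c = c.next) walkChain(c, out);
    }

    // Walks the chain, then descends into every function held in the side list.
    static void walkAll(Script script, List<String> out) {
        walkChain(script, out);
        for (Script fn : script.functions) walkAll(fn, out);
    }

    static List<String> chainOnly(Script s) {
        List<String> out = new ArrayList<>();
        walkChain(s, out);
        return out;
    }

    static List<String> withFunctions(Script s) {
        List<String> out = new ArrayList<>();
        walkAll(s, out);
        return out;
    }

    static Script sample() {
        Script fn = new Script("FunctionNode");
        fn.add(new N("body"));
        Script root = new Script("AstRoot");
        root.add(new N("Name")).add(new N("ExprResult"));
        root.functions.add(fn); // reachable only through the side list
        return root;
    }

    public static void main(String[] args) {
        System.out.println(chainOnly(sample()));     // [AstRoot, Name, ExprResult]
        System.out.println(withFunctions(sample())); // [AstRoot, Name, ExprResult, FunctionNode, body]
    }
}
```

Because a function can itself contain functions, walkAll recurses into each entry of the side list rather than doing a plain chain walk on it.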
For the simple snippet of JS code above, the AST created is as follows
As you can see, only the function's Name node is in the tree. AstRoot has two children: Name (Function) and Node (Expr_Result). The Expr_Result node has one child, Node (SetName). SetName has two children: Name (BindName) and NumberLiteral. This tree represents the entire code, except the body of the function test. As mentioned earlier, you can get the body of the function by calling getFunctionNode on AstRoot.
Now, if you inspect the Name (BindName) node, you will see that its parent is an Assignment node. The Assignment has Name (BindName) as its left node and NumberLiteral as its right node. However, Assignment is not part of the tree if you traverse it as described above.
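One way to spot this kind of mismatch is to collect every node reachable through the first/next links and then check whether each node's parent is among them. The sketch below does that on a hand-built model of the tree described above. None of it is Rhino code: N, its parent field and orphanParents are hypothetical, and the Assignment parent is attached by hand to BindName only, to reproduce what I saw when inspecting the node.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not Rhino classes.
class ParentCheck {
    static final class N {
        final String label;
        N first, next, parent;
        N(String label) { this.label = label; }

        // Appends a child to the sibling chain and points its parent at this node.
        N add(N child) {
            child.parent = this;
            if (first == null) {
                first = child;
            } else {
                N last = first;
                while (last.next != null) last = last.next;
                last.next = child;
            }
            return this;
        }
    }

    // Collects every node reachable through first/next links.
    static void collect(N node, List<N> out) {
        out.add(node);
        for (N c = node.first; c != null; c = c.next) collect(c, out);
    }

    // Reports reachable nodes whose parent is not itself reachable.
    static List<String> orphanParents(N root) {
        List<N> reachable = new ArrayList<>();
        collect(root, reachable);
        Set<N> seen = new HashSet<>(reachable); // identity-based: N does not override equals
        List<String> report = new ArrayList<>();
        for (N n : reachable) {
            if (n.parent != null && !seen.contains(n.parent)) {
                report.add(n.label + " -> " + n.parent.label);
            }
        }
        return report;
    }

    static N sample() {
        N bind = new N("BindName");
        N setName = new N("SetName").add(bind).add(new N("NumberLiteral"));
        N root = new N("AstRoot")
                .add(new N("Name"))
                .add(new N("ExprResult").add(setName));
        bind.parent = new N("Assignment"); // parent that is not on any child chain
        return root;
    }

    public static void main(String[] args) {
        System.out.println(orphanParents(sample())); // [BindName -> Assignment]
    }
}
```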