Blog : June 2007
posted on June 27, 2007 - 03:21 PM PDT by John Skaller
The Felix SVN repository is currently unstable because I'm upgrading the front end to support new parser technology.
Felix has adopted Dypgen as its parser. Dypgen is an extensible GLR parser. GLR (Generalised LR) parsers explore all alternatives in parallel, one progressing one token at a time for all branches. If a branch reaches a dead end, it is dropped. If sereral branches for one non-terminal succeed, they are merged under user control by a variety of strategies.
Dypgen allows new grammar to be added on the fly, and Felix was extended with a parse time directives which allows definition of new statements and expressions. This feature replaces the old pre-processor based extensions mechanism which used a recursive descent executable parser to interpret the user grammar.
However, the facility only works for expressions and statements, because the user actions are simply template expressions or statements, into which expressions and statements can be plugged. The macro processor handles the plugging, but it is only coded to handle expressions and statements. There's no way to handle things like function parameter lists, which aren't expressions or statements.
To solve this problem, Felix is switching to using Scheme programs as user actions. These programs are executed when a non-terminal is reduced, and the resulting s-expression is the new kind of AST. Each action can use the s-expressions produced by the non-terminals in its production as arguments to create a new s-expression.
These s-expressions are the native Scheme types of OCS scheme, which is written in Ocaml and embedded in the compiler. The final OCS scheme AST is then converted to an intermediate tree using a simpler s-expression type, and then that is converted into a Felix AST.
The advantage of this mechanism is that all the non-terminals return the same type and so all can be extended. The only restriction is that the final s-expression must resolve to a Felix AST. The main disadvantage is that Scheme is dynamically typed, so errors in user actions may not become apparent until the final conversion to the Felix AST form.
The effect of the new technology is that almost all the grammar will be moved out of the compiler into the library .. and the technology used to load it will be available to end users, allowing user packages to specify not just new types and functions, but also new syntax adapted to the domain.
In other words, the effect is intended to support user creation of Domain Specific Sub-Languages.