Suppose you wish to have ANTLR recognize non-degenerate tuples of expressions, like (x, y)
or (f(g), a, b)
but not (f)
or ()
. Trailing commas are not allowed. The following formulation of such a rule will likely elicit a complaint (“rule tuple has non-LL(*) decision due to recursive rule invocations reachable from alts 1,2; …”):
tuple : '(' (expr ',')+ expr ')' ;
ANTLR’s error message seems unhelpful at first glance, but it’s trying to point you in the right direction. What ANTLR is saying is that after it reaches a comma, it doesn’t know whether it should expect expr ')'
or expr ','
, and parsing expr
could requiring looking at an arbitrary number of tokens. ANTLR recommends left-factoring, but the structure of the rule as written doesn’t make it clear what needs to be factored. Re-writing the rule to make it closer to ANTLR’s view makes it clearer what the problem is:
tuple_start : '(' tuple_continue;
tuple_continue
: ( expr ',' tuple_continue
| expr ')');
Now it’s more obvious where the left-factoring needs to be applied, and how to do it by writing expr
once, leaving only the choice of seeing a comma or a close-paren:
tuple_start : '(' tuple_continue;
tuple_continue
: expr ( ',' tuple_continue
| ')' );
The rewritten expansion corresponds to an alternative structuring of the tuple rule that will satisfy ANTLR:
tuple : '(' expr (',' expr)+ ')';
The ANTLR wiki has a page on removing global backtracking from ANTLR grammars, which has a nice presentation of how to do left-factoring when the opportunity is obvious.