
I have a very simple grammar that looks like this:

grammar Testing;

a :  d | b;
b : {_input.LT(1).equals("b")}? C;
d : {!_input.LT(1).equals("b")}? C;
C : .;

It parses one character from the input and checks whether it is equal to the character b. If so, rule b is used; if not, rule d is used.

However, the actual parse does not match that expectation: every input is parsed with the first alternative (rule d).

$ antlr Testing.g4
$ javac *.java
$ grun Testing a -trace
c
enter   a, LT(1)=c
enter   d, LT(1)=c
consume [@0,0:0='c',<1>,1:0] rule d
exit    d, LT(1)=

exit    a, LT(1)=

$ grun Testing a -trace
b
enter   a, LT(1)=b
enter   d, LT(1)=b
consume [@0,0:0='b',<1>,1:0] rule d
exit    d, LT(1)=

exit    a, LT(1)=

In both cases, rule d is used. However, since there is a guard on rule d, I expect rule d to fail when the first character is exactly 'b'.

Am I doing something wrong when using the semantic predicates?

(I need semantic predicates because I am parsing a language in which keywords can also be used as identifiers.)

Reference: https://github.com/antlr/antlr4/blob/master/doc/predicates.md

1 Answer


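_input.LT(int) returns a Token, not a String, so Token.equals(String) always returns false. As a result the predicate in b never passes and the negated predicate in d always passes, which is why rule d is chosen for every input. What you want to do is call getText() on the Token: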

b : {_input.LT(1).getText().equals("b")}? C;
d : {!_input.LT(1).getText().equals("b")}? C;
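
The difference between the two predicates can be reproduced outside ANTLR. The following is a minimal sketch, not ANTLR code: FakeToken is a hypothetical stand-in for a token class that, like the token in the question, carries text but does not override equals(Object), and the method names are made up for illustration.

```java
public class TokenEqualsDemo {
    // Hypothetical stand-in for a token: it holds text but inherits
    // Object.equals, so it is only ever equal to itself.
    static final class FakeToken {
        private final String text;

        FakeToken(String text) {
            this.text = text;
        }

        String getText() {
            return text;
        }
    }

    // Mirrors the predicate from the question: a token object is compared
    // with a String, which can never be the same object.
    static boolean brokenPredicate(FakeToken lookahead) {
        return lookahead.equals("b");
    }

    // Mirrors the corrected predicate: the token's text is compared instead.
    static boolean fixedPredicate(FakeToken lookahead) {
        return lookahead.getText().equals("b");
    }

    public static void main(String[] args) {
        FakeToken b = new FakeToken("b");
        FakeToken c = new FakeToken("c");

        System.out.println(brokenPredicate(b)); // false, even though the text is "b"
        System.out.println(fixedPredicate(b));  // true
        System.out.println(fixedPredicate(c));  // false
    }
}
```

Because the broken check is false for every input, its negation in rule d is true for every input, which matches the traces in the question.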

However, it is often easier to handle keywords-as-identifiers without predicates, by listing the keywords as alternatives of an identifier parser rule:

rule
 : KEYWORD_1 identifier
 ;

identifier
 : IDENTIFIER
 | KEYWORD_1
 | KEYWORD_2
 | KEYWORD_3
 ;

KEYWORD_1 : 'k1';
KEYWORD_2 : 'k2';
KEYWORD_3 : 'k3';

IDENTIFIER : [a-zA-Z_] [a-zA-Z_0-9]*;
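
To see why this works without predicates, here is a rough sketch of the idea in plain Java. It is not ANTLR-generated code: the TokenType enum, classify, isIdentifier and matchesRule are hypothetical names, and classify only handles one whole word at a time. The point it illustrates is that the lexer still labels k1 as a keyword, while the identifier rule accepts keyword token types as well.

```java
import java.util.EnumSet;
import java.util.Set;

public class KeywordAsIdentifierDemo {
    enum TokenType { KEYWORD_1, KEYWORD_2, KEYWORD_3, IDENTIFIER }

    // Hypothetical stand-in for the lexer, applied to a single whole word:
    // keywords win over IDENTIFIER because their rules are listed first.
    static TokenType classify(String word) {
        switch (word) {
            case "k1": return TokenType.KEYWORD_1;
            case "k2": return TokenType.KEYWORD_2;
            case "k3": return TokenType.KEYWORD_3;
            default:
                if (word.matches("[a-zA-Z_][a-zA-Z_0-9]*")) {
                    return TokenType.IDENTIFIER;
                }
                throw new IllegalArgumentException("not a token: " + word);
        }
    }

    // Mirrors the `identifier` parser rule: IDENTIFIER or any of the keywords.
    static final Set<TokenType> IDENTIFIER_TYPES = EnumSet.allOf(TokenType.class);

    static boolean isIdentifier(TokenType type) {
        return IDENTIFIER_TYPES.contains(type);
    }

    // Mirrors `rule : KEYWORD_1 identifier ;` for a two-word input.
    static boolean matchesRule(String first, String second) {
        return classify(first) == TokenType.KEYWORD_1 && isIdentifier(classify(second));
    }

    public static void main(String[] args) {
        System.out.println(matchesRule("k1", "foo")); // true: keyword, then plain identifier
        System.out.println(matchesRule("k1", "k2"));  // true: keyword used as identifier
        System.out.println(matchesRule("foo", "k1")); // false: first token is not KEYWORD_1
    }
}
```

The position in the parser rule, not a predicate, decides whether a keyword token acts as a keyword or as an identifier.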
  • Thank you. The missing getText was a mistake I made while simplifying the grammar that I am actually having trouble with. But your suggestion solved my problem. Thank you!
    – uucp
    Commented Jul 30, 2020 at 20:37
