In general, I am trying to create a Java-based application in which I can compile a dictionary of terms that supports simple regular expressions. The dictionary will then be used to build a simple entity tagger that marks recognised terms in a text. I thought ANTLR might provide everything I need. The application must not depend on pre-compiled grammar and lexer files, since the grammar has to be updated at runtime every few minutes.

Here is my simple "Hello World" application:

LexerGrammar lg = new LexerGrammar(
                "lexer grammar L;\n" +
                "A : ('a'|'A');\n" +
                "B : ('b'|'B');\n" +
                "C : ('c'|'C');\n" +
                "D : ('d'|'D');\n" +
                "FILL_TOKEN : (.);\n");

Grammar g = new Grammar(
                "parser grammar T;\n" +
                "t_abc : A FILL_TOKEN? B FILL_TOKEN? C;\n" +
                "t_abcd : A FILL_TOKEN? B FILL_TOKEN? C FILL_TOKEN? D;\n" +

                "rule0   : t_abcd|t_abc;\n" +

                "ws  : '.' -> skip ;\n",
                lg);

LexerInterpreter lexEngine =
                lg.createLexerInterpreter(new ANTLRInputStream("Test A BCD"));
CommonTokenStream tokens = new CommonTokenStream(lexEngine);
ParserInterpreter parser = g.createParserInterpreter(tokens);
Rule rule = g.rules.get("rule0");
ParseTree t = parser.parse(rule.index);

System.out.println(t.getText());

When I run the application, I get the following error:

Exception in thread "main" java.lang.NullPointerException
    at org.antlr.v4.runtime.atn.ATNSerializer.serialize(ATNSerializer.java:73)
    at org.antlr.v4.runtime.atn.ATNSerializer.getSerialized(ATNSerializer.java:601)
    at org.antlr.v4.runtime.atn.ATNSerializer.getSerializedAsChars(ATNSerializer.java:605)
    at org.antlr.v4.tool.Grammar.createParserInterpreter(Grammar.java:1337)
    at main.OnTheFly.main(OnTheFly.java:98)

When I comment out the "ws : '.' -> skip ;\n" line of the grammar, the program runs, but it complains that Test is not known.

What am I doing wrong, or does the default grammar not support the skip command? I am using ANTLR 4.7.2 and Java 1.8.0 (131).

1 Answer

Found the answer. Only the lexer supports the skip command, so the "-> skip" rule has to go into the lexer grammar rather than the parser grammar. In addition, it turned out that I only needed the lexer altogether. The matches can be retrieved by looking at the resulting tokens:

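...
// Uses the code from above, with the grammar part including the skip rule.
// In addition, all tokens have to be defined in
// ...

// fill() is required so that the stream reads all tokens from the lexer
tokens.fill();
for (Token token : tokens.getTokens()) {
    int typeId = token.getType();
    if (typeId == Token.EOF) {
        // the EOF token (type -1) marks the end of the input
        break;
    }

    // Token types start at 1, while the rule-name array is indexed from 0.
    // This assumes every lexer rule defines exactly one token type, in order.
    String ruleName = lexEngine.getRuleNames()[typeId - 1];
    System.out.println("Token: " + token.getText() + " - " + ruleName);
}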

More information about the lexer and grammar vocabulary can be found here:

https://github.com/antlr/antlr4/blob/master/doc/lexer-rules.md

https://github.com/antlr/antlr4/blob/master/doc/parser-rules.md
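
For comparison, if the dictionary entries really are just simple regular expressions, the same idea (rebuild the dictionary at runtime, then report every match) can be sketched without ANTLR using java.util.regex. This is a swapped-in technique, not the lexer-interpreter approach above. The class name RegexTagger, its update and tag methods, and the label@offset:text output format are all hypothetical, and the two patterns are only a rough translation of t_abc and t_abcd from the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: a dictionary of labelled regular expressions that can
// be swapped out at runtime and applied to a text to tag every match.
public class RegexTagger {

    // label -> compiled pattern; replaced as a whole on every update
    private volatile Map<String, Pattern> dictionary = new LinkedHashMap<>();

    // Recompile all terms and publish the new dictionary in one assignment.
    public void update(Map<String, String> terms) {
        Map<String, Pattern> compiled = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : terms.entrySet()) {
            compiled.put(e.getKey(),
                    Pattern.compile(e.getValue(), Pattern.CASE_INSENSITIVE));
        }
        dictionary = compiled;
    }

    // Return one "label@offset:matchedText" entry per match, per term.
    public List<String> tag(String text) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Pattern> e : dictionary.entrySet()) {
            Matcher m = e.getValue().matcher(text);
            while (m.find()) {
                hits.add(e.getKey() + "@" + m.start() + ":" + m.group());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        RegexTagger tagger = new RegexTagger();
        Map<String, String> terms = new LinkedHashMap<>();
        // ".?" plays the role of the optional FILL_TOKEN in the question
        terms.put("t_abcd", "a.?b.?c.?d");
        terms.put("t_abc", "a.?b.?c");
        tagger.update(terms);
        // prints [t_abcd@5:A BCD, t_abc@5:A BC]
        System.out.println(tagger.tag("Test A BCD"));
    }
}
```

Calling update again with a new map replaces the whole dictionary, which covers the "updated every few minutes" requirement. Unlike a lexer, this reports overlapping matches from different terms instead of choosing one token per position.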
