write ANTLR regex when declaring variables list

Question

I've written a grammar rule for a language in ANTLR as below:

variable: idlist COLON type (EQUAL explist)? SEMI;
idlist: identifier (COMMA identifier)*;
explist: exp (COMMA exp)*;

COLON: ':';
EQUAL: '=';
SEMI: ';';
COMMA: ',';

This input is valid for the grammar above:

a, b, c: integer = 3, 4, 6;

Now I want this input:

a, b, c, d: integer = 3, 4, 6;

or this:

a, b, c: integer = 3, 4, 6, 1;

to be invalid, because the number of identifiers in idlist differs from the number of values in explist. How do I rewrite my grammar? Thanks.

Easy. Use a "pumping" form of "variable". Try this:

grammar Foo;
variable: var_ ';' EOF;
var_: identifier var2 exp;
var2: ',' identifier var2 exp ',' | type_info;
type_info: ':' type '=';
identifier: Id;
Id: [a-zA-Z_]+;
type: Id;
exp: Num;
Num: [0-9]+;
WS: [ \t\n\r]+ -> skip;

Make sure to use full grammars and examples in questions. In ANTLR, make sure to use an EOF-terminated start rule.
– kaby76
Commented Feb 26, 2023 at 11:58

Mike Lischke · Accepted Answer · 2023-02-26 09:46:59Z

3

Don't let your grammar handle this semantic task. The syntax is correct in all those cases. The constraint that the number of left-hand-values and right-hand-values must be equal is a semantic rule you should enforce in the semantic phase, following the parse step (the syntactic phase). This is usually done by evaluating the generated parse tree.
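As a rough illustration of such a semantic check, here is a minimal Python sketch. It does not use the ANTLR runtime: `parse_decl` is a hypothetical regex-based stand-in for the parse step, and `VariableDecl` and `check_decl` are names invented for this example. In a real project the same count comparison would live in a listener or visitor over the generated parse tree.

```python
import re
from dataclasses import dataclass


@dataclass
class VariableDecl:
    """Hypothetical result of the syntactic phase for one declaration."""
    ids: list[str]
    type_name: str
    exps: list[str]


def parse_decl(text: str) -> VariableDecl:
    """Stand-in for the parser: accepts any number of ids and expressions."""
    m = re.fullmatch(r"\s*(.+?)\s*:\s*(\w+)\s*(?:=\s*(.+?))?\s*;\s*", text)
    if m is None:
        raise SyntaxError(f"not a declaration: {text!r}")
    ids = [s.strip() for s in m.group(1).split(",")]
    exps = [s.strip() for s in m.group(3).split(",")] if m.group(3) else []
    return VariableDecl(ids, m.group(2), exps)


def check_decl(decl: VariableDecl) -> list[str]:
    """Semantic rule: an initializer list, if present, must match the id count."""
    if decl.exps and len(decl.exps) != len(decl.ids):
        return [f"{len(decl.ids)} identifier(s) but {len(decl.exps)} initializer(s)"]
    return []


if __name__ == "__main__":
    for src in ("a, b, c: integer = 3, 4, 6;",
                "a, b, c, d: integer = 3, 4, 6;",
                "a, b, c: integer = 3, 4, 6, 1;"):
        print(src, "->", check_decl(parse_decl(src)) or "ok")
```

Because the check runs after parsing, the error message can name the actual problem (the two counts) instead of pointing at an unexpected token.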

answered Feb 26, 2023 at 9:46


But if we are forced to write it in the grammar, how do we do that?
– Duy Duy
Commented Feb 26, 2023 at 10:02
@DuyDuy that is not possible other than manually writing 1 id + 1 exp, 2 ids + 2 exp, 3 ids + 3 exp, ...
– Bart Kiers
Commented Feb 26, 2023 at 11:17
@BartKiers there is a way to write it in ANTLR, but I cannot find it
– Duy Duy
Commented Feb 26, 2023 at 11:34
@DuyDuy, no, you are wrong, there is no generic way to define an X amount of ids with an equal X amount of expressions in an ANTLR grammar. You could do something with a predicate, but you cannot do it in plain ANTLR syntax.
– Bart Kiers
Commented Feb 26, 2023 at 11:37
@BartKiers do you know how to replicate a group of tokens n times, if we write regex in ANTLR?
– Duy Duy
Commented Feb 26, 2023 at 11:42


Pavel Ganelin · Accepted Answer · 2023-02-28 15:59:12Z

2

I agree with the previous answer that the proper way to do it would be later during the evaluation of the generated parse tree/AST.

But if you insist, here is an example of a grammar that matches the number of declarations to the number of initializations. Please note that it is more an exercise in demonstrating the power of context-free grammars than something I would be glad to find in production code.

grammar Test;

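list: IDENTIFIER middle exp ';';

// Each recursive step adds one identifier on the left and one expression on
// the right; ':' type '=' is the base case in the middle.
// No leading '|' here: in ANTLR 4 it would add an empty alternative.
middle :
     ',' IDENTIFIER middle exp ','
   | ':' type '='
;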

exp:
   NUMBER
;

type:
    IDENTIFIER
;

NUMBER : [0-9]+;
IDENTIFIER : [a-z]+;
COLON: ':';
EQUAL: '=';
SEMI: ';';
COMMA: ',';
WS: [ \t\r\n]+ -> skip; // skip whitespace between tokens

If the list sizes do not match, the error message is confusing:

line 1:11 mismatched input ',' expecting ';'

This is why I would not recommend using this approach.
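To see how the recursion pairs things up, here is a small hand-written recursive-descent sketch in Python that mirrors the recursive `middle` rule (the comma-wrapped alternative plus the `':' type '='` base case). It is not ANTLR-generated code; the `Parser` class and its method names are made up for this illustration. Identifiers are collected from the outside in and expressions from the inside out, so zipping the two lists restores the left-to-right pairing.

```python
import re

# One token per match: identifier, number, or a single punctuation character.
TOKEN = re.compile(r"\s*([a-z]+|[0-9]+|[,:=;])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else "<EOF>"

    def expect(self, kind):
        tok = self.peek()
        if kind == "ID":
            ok = tok.isalpha()
        elif kind == "NUM":
            ok = tok.isdigit()
        else:
            ok = tok == kind
        if not ok:
            raise SyntaxError(f"mismatched input {tok!r} expecting {kind!r}")
        self.pos += 1
        return tok

    def parse_list(self):
        # list: IDENTIFIER middle exp ';'
        name = self.expect("ID")
        type_name, ids, exps = self.parse_middle()
        exps.append(self.expect("NUM"))
        self.expect(";")
        self.expect("<EOF>")
        return type_name, list(zip([name] + ids, exps))

    def parse_middle(self):
        # middle: ',' IDENTIFIER middle exp ','  |  ':' type '='
        if self.peek() == ",":
            self.expect(",")
            name = self.expect("ID")
            type_name, ids, exps = self.parse_middle()
            exps.append(self.expect("NUM"))
            self.expect(",")
            return type_name, [name] + ids, exps
        self.expect(":")
        type_name = self.expect("ID")
        self.expect("=")
        return type_name, [], []


if __name__ == "__main__":
    print(Parser("a, b, c: integer = 3, 4, 6;").parse_list())
```

With unequal lists the sketch fails the same way the grammar does: the parser only notices a token it did not expect, which is why the resulting message says nothing about the counts.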

edited Feb 28, 2023 at 15:59

answered Feb 28, 2023 at 15:41


Haha, yes, you're right. But now you also extend your answer by telling the OP how they can easily match each identifier to their corresponding expression in this horribly (recursive) generated parse tree 😉 (nevertheless, +1 for the "horrible" solution :))
– Bart Kiers
Commented Feb 28, 2023 at 19:31
This is a good answer. It is difficult to determine which token(s) form the base case of the recursion when writing the rule.
– Duy Duy
Commented May 6, 2023 at 1:53
