#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    9
    Rep Power
    0

    Can't find/fix shift reduce error


    I am trying to write a ST ( structured text ) compiler that follows the IEC 61131-3 spec. It is Pascal like.
    I am an experienced programmer but a new at writing compilers. When I compile my test code for a case statement

    CASE A OF 0: B:=0; 1: B:=1; 2: B:=2; END_CASE

    The parser ends up wind up in the wrong state. The code compiles OK to where the '.' is

    CASE A OF 0: B:=0; . 1: B:=1; 2: B:=2; END_CASE

    The 0: B:=0; is a case_element and it should reduce but it doesn't so instead of working on the next case_element ( 1: B:= 1; ) it thinks it is starting another statement.

    The syntax does not use a break keyword to indicate the end of a case element. This is what is giving me problems. Here are my rules that are pertinent to my problem.

    Code:
    stmt:
    		| CASE expr OF case_element_list END_CASE	{ $$ = opr(CASE,2,$2,$4); }
    		| CASE expr OF case_element_list ELSE stmt_list END_CASE { $$ = opr(CASE,3,$2,$4,$6); }
            ;
    
    case_element_list:
    			case_element						{ $$ = $1; }
    		|	case_element_list case_element		{ $$ = opr(';', 2, $1, $2); }
    		;
    
    case_element:
    		INTEGER ':' stmt_list					{ $$ = opr(':',2,$1,$3); }
    		;
    
    stmt_list:
              stmt									{ $$ = $1; }
            | stmt_list stmt						{ $$ = opr(';', 2, $1, $2); }
            ;
    After the OF the parser enter state 39 as expected and process to state 56 and 66 but when it gets to state 74 it doesn't reduce.
    The ':' should match a expr that causes the state machine to go to state 14.


    Relevant parts of the parser debug file

    State 74 conflicts: 1 shift/reduce

    state 39

    12 stmt: CASE expr OF . case_element_list END_CASE
    13 | CASE expr OF . case_element_list ELSE stmt_list END_CASE

    INTEGER shift, and go to state 56

    case_element_list go to state 57
    case_element go to state 58


    state 56

    20 case_element: INTEGER . ':' stmt_list

    ':' shift, and go to state 66

    state 66

    20 case_element: INTEGER ':' . stmt_list

    INTEGER shift, and go to state 4
    VARIABLE shift, and go to state 5
    PRINT shift, and go to state 6
    WHILE shift, and go to state 7
    IF shift, and go to state 8
    CASE shift, and go to state 9
    '-' shift, and go to state 10
    ';' shift, and go to state 11
    '(' shift, and go to state 12

    stmt go to state 53
    stmt_list go to state 74
    expr go to state 14

    state 74

    20 case_element: INTEGER ':' stmt_list .
    22 stmt_list: stmt_list . stmt

    INTEGER shift, and go to state 4
    VARIABLE shift, and go to state 5
    PRINT shift, and go to state 6
    WHILE shift, and go to state 7
    IF shift, and go to state 8
    CASE shift, and go to state 9
    '-' shift, and go to state 10
    ';' shift, and go to state 11
    '(' shift, and go to state 12

    INTEGER [reduce using rule 20 (case_element)]
    $default reduce using rule 20 (case_element)

    stmt go to state 60
    expr go to state 14


    The run time ooutput of the parser is.

    Shifting token ';'
    Entering state 52
    Reducing stack by rule 7 (line 63):
    $1 = token VARIABLE ()
    $2 = token ASSIGN ()
    $3 = nterm expr ()
    $4 = token ';' ()
    -> $$ = nterm stmt ()
    Stack now 0 2 9 20 39 56 66
    Entering state 53
    Reducing stack by rule 21 (line 92):
    $1 = nterm stmt ()
    -> $$ = nterm stmt_list ()
    Stack now 0 2 9 20 39 56 66
    Entering state 74
    Reading a token: --accepting rule at line 49 ("
    --accepting rule at line 20 ("1")
    Next token is token INTEGER ()
    Shifting token INTEGER ()
    Entering state 4
    Reducing stack by rule 23 (line 97):
    $1 = token INTEGER ()
    -> $$ = nterm expr ()
    Stack now 0 2 9 20 39 56 66 74
    Entering state 14
    Reading a token: --accepting rule at line 25 ("
    Next token is token ':' ()
    0: syntax error, unexpected ':' at ':'
    Error: popping nterm expr ()
    Stack now 0 2 9 20 39 56 66 74
    Error: popping nterm stmt_list ()
    Stack now 0 2 9 20 39 56 66
    Error: popping token ':' ()
    Stack now 0 2 9 20 39 56
    Error: popping token INTEGER ()
    Stack now 0 2 9 20 39
    Error: popping token OF ()
    Stack now 0 2 9 20
    Error: popping nterm expr ()
    Stack now 0 2 9
    Error: popping token CASE ()
    Stack now 0 2
    Error: popping nterm function ()
    Stack now 0
    Cleanup: discarding lookahead token ':' ()
    Stack now 0stmt:
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    May 2007
    Posts
    765
    Rep Power
    929
    Been a while since I looked at parser-generator output...take the following with skepticism.

    Does one of the stmt productions you're not showing us start with INTEGER (or a rule it invokes like a Identifier that doesn't limit its first character to alphabetics?) so the parser can't tell if you're starting a new stmt or the next case_element?
    Last edited by OmegaZero; February 17th, 2012 at 12:17 PM.
    sub{*{$::{$_}}{CODE}==$_[0]&& print for(%:: )}->(\&Meh);
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    9
    Rep Power
    0
    Originally Posted by OmegaZero
    Been a while since I looked at parser-generator output...take the following with skepticism.

    Does one of the stmt productions you're not showing us start with INTEGER (or a rule it invokes like a Identifier that doesn't limit its first character to alphabetics?) so the parser can't tell if you're starting a new stmt or the next case_element?
    Only the case_element starts with an integer and an expression. I was expecting that the INTEGER would start a new case_element state. I don't even mention expr in the production rules for the case_element and case_element_list.

    I thought about that myself so I made sure the assignment rule assigned to a lval not an expression since expressions can have a leading integer constant. I carefully limited what a lval could be. It didn't make any difference. I also tried making a dummy %noassoc terminal similar to what is done to solve the dangling IF ELSE problem.

    I think the problem is like the dangling IF ELSE problem. In this case the CASE OF statement doesn't have a nice way of making it easy to specify when the case_element ends. The is no END_CASE_ELEMENT key word or break as in C or begin - end as in Pascal.

    Perhaps if the token look ahead was two instead of one token.

    I am moving on to other parts of the compiler in hopes that I will stumble upon the answer along the way.

    The code is below. I am not that far along but I wanted to tackle the IF THEN ELSEIF END_IF and the CASE OF END_CASE problem first. The code generator itself is in a routine called ex().
    The code generator itself doesn't seem to be that bad as long as one is not looking for ultimate efficiency.

    Code:
    stmt:
              ';'									{ $$ = opr(';', 2, NULL, NULL); }
            | expr ';'								{ $$ = $1; }
            | PRINT expr ';'						{ $$ = opr(PRINT, 1, $2); }
            | lval ASSIGN expr ';'					{ $$ = opr(ASSIGN, 2, id($1), $3); }
            | WHILE expr DO stmt_list END_WHILE		{ $$ = opr(WHILE, 2, $2, $4); }
            | IF expr THEN stmt_list END_IF			{ $$ = opr(IF, 2, $2, $4); }
            | IF expr THEN stmt_list ELSE stmt_list END_IF	{ $$ = opr(IF, 3, $2, $4, $6); }
            | IF expr THEN stmt_list elseif_list END_IF		{ $$ = opr(IF, 3, $2, $4, $5); }
    		| CASE expr OF case_element_list END_CASE					{ $$ = opr(CASE,2,$2,$4); }
    		| CASE expr OF case_element_list ELSE stmt_list END_CASE	{ $$ = opr(CASE,3,$2,$4,$6); }
            ;
    
    stmt_list:
              stmt									{ $$ = $1; }
            | stmt_list stmt						{ $$ = opr(';', 2, $1, $2); }
            ;
    
    elseif_list:
    			elseif_element						{ $$ = $1; }
    		|	elseif_list elseif_element			{ $$ = opr(';', 2, $1, $2); } 
    		;
    		
    elseif_element:
    			ELSEIF expr THEN stmt_list			{ $$ = opr(ELSEIF, 2, $2, $4); }
    		|	ELSEIF expr THEN stmt_list ELSE stmt_list { $$ = opr(ELSEIF, 3, $2, $4, $6); }     
    		;		
    
    case_element_list:
    			case_element						{ $$ = $1; }
    		|	case_element_list case_element		{ $$ = opr(';', 2, $1, $2); }
    		;
    
    case_element:
    		INTEGER ':' stmt_list					{ $$ = opr(':',2,$1,$3); }
    		;
    
    lval:
    		VARIABLE				{ $$ = $1; }
    		;
    
    expr:
              INTEGER               { $$ = con($1); }
            | VARIABLE              { $$ = id($1); }
            | '-' expr %prec UMINUS { $$ = opr(UMINUS, 1, $2); }
            | expr '+' expr         { $$ = opr('+', 2, $1, $3); }
            | expr '-' expr         { $$ = opr('-', 2, $1, $3); }
            | expr '*' expr         { $$ = opr('*', 2, $1, $3); }
            | expr '/' expr         { $$ = opr('/', 2, $1, $3); }
            | expr '<' expr         { $$ = opr('<', 2, $1, $3); }
            | expr '>' expr         { $$ = opr('>', 2, $1, $3); }
            | expr GE expr          { $$ = opr(GE, 2, $1, $3); }
            | expr LE expr          { $$ = opr(LE, 2, $1, $3); }
            | expr NE expr          { $$ = opr(NE, 2, $1, $3); }
            | expr '=' expr         { $$ = opr('=', 2, $1, $3); }
    		| expr XPY expr			{ $$ = opr(XPY, 2, $1, $3); }
            | '(' expr ')'          { $$ = $2; }
            ;
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    May 2007
    Posts
    765
    Rep Power
    929
    There's your problem:

    After the '1' is read in it can either reduce the INTEGER to expr and then to stmt.

    Or it can shift expecting the ':' for the case_element.

    If you remove the expr ; production from stmt that would solve it.

    You could also check your parser generator to see if there is a way to tweak conflict resolution.

    If you could tell it to prefer shift in this case that would also work.

    Or I don't see any other place in your grammar that permits a colon to follow a integer. You could tweak your lexer to treat this as a single 'LABEL' token instead of a 'INTEGER' + ':'
    sub{*{$::{$_}}{CODE}==$_[0]&& print for(%:: )}->(\&Meh);
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    9
    Rep Power
    0

    Thanks, that got somewhere


    I removed the expr rule as suggested and the compile process got through the parse tree generator. Great. I don't know if removing the expr rule will work in the long run but I can now work on the code generator for the CASE OF END_CASE. I tried the other suggestions and they didn't seem to work. I also tried enabling the %glr-parser. This parser is an enhancement to the default parser and apparently can scan more than one token ahead so it is smarter about when to reduce or shift. However, my Microsoft C compiler chokes on the macros in y.tab.c file that the glr parser generates. There are a lot of #ifdefs and #ifndefs. The y.tab.c code is ugly. Last week was a frustrating week. This week has got off to a good start.

    Now to generate some CASE OF END_CASE code.

    Thanks.

IMN logo majestic logo threadwatch logo seochat tools logo