#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    9
    Rep Power
    0

    Writing a Structured Text IEC 61131-3 Compiler


    The problem I have is that I get a lot of reduce/reduce conflicts when declaring variables of user-defined types. I am using Flex and Bison. The problem is in the VAR block, where I can declare a list of variables and assign a type. There isn't a problem if the type is a basic type like INT, but if I use a user-defined type then Bison can't tell the difference between an array_type_name like MY_ARRAY, a structure_type_name, an enumerated_type_name, a subrange_type_name or a simple_type_name, and gives up. I have GLR enabled but still no joy. I can work around this problem by enhancing Flex to return a different token than IDENTIFIER when it sees that the type name has already been defined and is known. For instance, by the time the VAR block is parsed MY_ARRAY has already been defined, so I can return an ARRAY_ID. I can do the same for the other categories of type names, but I was hoping there is a cleaner way. I tried the idea in the Bison manual for resolving mysterious conflicts but it didn't work. Any ideas? Thanks for looking.

    Code:
    TYPE
        MY_ARRAY : ARRAY[1..10] OF INT;
    END_TYPE
    
    PROGRAM TEST
    	VAR
    		B: MY_ARRAY;  // THIS IS THE PROBLEM
    		I,J,K: INT;
    		A: INT;
    	END_VAR
    A:=B[I+J];
    END_PROGRAM
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    May 2007
    Posts
    765
    Rep Power
    929
    Can you post the relevant parts of your grammar & lexer and any details bison gives you related to the conflict?
    sub{*{$::{$_}}{CODE}==$_[0]&& print for(%:: )}->(\&Meh);
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    9
    Rep Power
    0

    relevant part of my flex and bison code


    Below is what I have done to fix my problem. It works, but I haven't seen others do it this way. Originally the Flex {identifier} rule just returned IDENTIFIER, but there are too many places in the Bison file where variables can be declared.
    Edit: I tried reformatting but it didn't work.

    Code:
    var1_init_decl:
        var1_list ':' spec_init_opt
    		{ $$ = opr(VAR1_INIT_DECL,2,$1,$3);
    		printf("\nAllocate Here!\n\n"); }
    	;
    Where var1_list is a list of variables to be declared and spec_init_opt is a type name. I am just showing the declarations for the simple types. When array_type_name and structure_type_name return only IDENTIFIER too, Bison gets confused. Originally all the type names just returned IDENTIFIER, but now I return a token that distinguishes the type if I see the type name is already defined. It works, but it doesn't look like anything I have seen before.

    Code:
    // in the Flex file
    {identifier}	    {	yylval.sIndex = HashId(yytext,&used);
    			/* Already-defined type names get a category-specific token */
    			if ( SymTbl[yylval.sIndex].type >= TYP_LAST )
    				switch ( TypTbl[SymTbl[yylval.sIndex].type].GenType )
    				{
    				case SimpleEnum:	return ID_SIMPLE;
    				case SubRngEnum:	return ID_SUBRNG;
    				case EnumEnum:		return ID_ENUM;
    				case ArrayEnum:		return ID_ARRAY;
    				case StructEnum:	return ID_STRUCT;
    				case FCEnum:		return ID_FC;
    				case FBEnum:		return ID_FB;
    				default:		break;	/* unlisted category: treat as a plain identifier */
    				}
    			/* Every path returns a token, so no identifier is silently dropped */
    			return IDENTIFIER;
    		}
    
    // in the Bison file				
    				
    single_element_type_name:				// <nPtr>
    		simple_type_name				{ $$ = $1; }
    	|	subrange_type_name				{ $$ = $1; }
    	|	enumerated_type_name			{ $$ = $1; }
    	;
    
    simple_type_name:						// <nPtr>
    		IDENTIFIER						{ $$ = opr(SIMPLE_TYPE_NAME,1,id($1)); }
    	|	ID_SIMPLE						{ $$ = opr(SIMPLE_TYPE_NAME,1,id($1)); }
    	;
    	
    subrange_type_name:						// <nPtr>
    		IDENTIFIER						{ $$ = opr(SUBRANGE_TYPE_NAME,1,id($1)); }
    	|	ID_SUBRNG						{ $$ = opr(SUBRANGE_TYPE_NAME,1,id($1)); }
    	;
    	
    enumerated_type_name:					// <nPtr>
    		IDENTIFIER						{ $$ = opr(ENUMERATED_TYPE_NAME,1,id($1)); }	
    	|	ID_ENUM							{ $$ = opr(ENUMERATED_TYPE_NAME,1,id($1)); }	
    	; 	
    		
    array_type_name:						// <nPtr>
    		IDENTIFIER						{ $$ = opr(ARRAY_TYPE_NAME,1,id($1)); }
    	|	ID_ARRAY						{ $$ = opr(ARRAY_TYPE_NAME,1,id($1)); }
    	;
    		
    structure_type_name:					// <nPtr>
    		IDENTIFIER						{ $$ = opr(STRUCTURE_TYPE_NAME,1,id($1)); }
    	|	ID_STRUCT						{ $$ = opr(STRUCTURE_TYPE_NAME,1,id($1)); }
    	;
    
    var1_init_decl:
    	var1_list ':' spec_init_opt
    										{ $$ = opr(VAR1_INIT_DECL,2,$1,$3);
    											printf("\nAllocate Here!\n\n"); }
    	;
    
    var1_list:
    		variable_name					{ $$ = $1; }
    	|	var1_list ',' variable_name		{ $$ = opr(VAR1_LIST,2,$1,$3); }
    	;
    
    array_var_init_decl:
    		var1_list ':' array_spec_init	{ $$ = opr(ARRAY_VAR_INIT_DECL,2,$1,$3); }
    	;
    	
    spec_init_opt:
    		simple_spec_init				{ $$ = $1; }
    	|	subrange_spec_init				{ $$ = $1; }
    	|	enumerated_spec_init			{ $$ = $1; }
    	;
    Last edited by pnachtwey; July 17th, 2012 at 11:01 AM. Reason: Formatting
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    9
    Rep Power
    0

    Well now I know that doesn't work.


    It worked well enough to get the array declarations done, but I moved on to the structure declarations and it looks like Bison is giving up after creating four stacks. The basic problem is reduce/reduce conflicts.

    Here is part of the Bison output file. Since the TYPE ... END_TYPE block is declared early, Flex can return the ID_SIMPLE, ID_ENUM, etc. tokens for type names that are already known; a name that is not yet known is returned as IDENTIFIER.

    state 1

    68 data_type_declaration: TYPE . type_declaration_list END_TYPE

    IDENTIFIER shift, and go to state 6
    ID_SIMPLE shift, and go to state 7
    ID_SUBRNG shift, and go to state 8
    ID_ENUM shift, and go to state 9
    ID_ARRAY shift, and go to state 10
    ID_STRUCT shift, and go to state 11

    simple_type_name go to state 12
    subrange_type_name go to state 13
    enumerated_type_name go to state 14
    array_type_name go to state 15
    structure_type_name go to state 16
    type_declaration_list go to state 17
    type_declaration go to state 18
    single_element_type_declaration go to state 19
    simple_type_declaration go to state 20
    subrange_type_declaration go to state 21
    enumerated_type_declaration go to state 22
    array_type_declaration go to state 23
    structure_type_declaration go to state 24

    state 6

    58 simple_type_name: IDENTIFIER .
    60 subrange_type_name: IDENTIFIER .
    62 enumerated_type_name: IDENTIFIER .
    64 array_type_name: IDENTIFIER .
    66 structure_type_name: IDENTIFIER .

    IDENTIFIER reduce using rule 58 (simple_type_name)
    IDENTIFIER [reduce using rule 60 (subrange_type_name)]
    IDENTIFIER [reduce using rule 62 (enumerated_type_name)]
    IDENTIFIER [reduce using rule 64 (array_type_name)]
    IDENTIFIER [reduce using rule 66 (structure_type_name)]
    ':' reduce using rule 58 (simple_type_name)
    ':' [reduce using rule 60 (subrange_type_name)]
    ':' [reduce using rule 62 (enumerated_type_name)]
    ':' [reduce using rule 64 (array_type_name)]
    ':' [reduce using rule 66 (structure_type_name)]
    ASSIGN reduce using rule 58 (simple_type_name)
    ASSIGN [reduce using rule 60 (subrange_type_name)]
    ASSIGN [reduce using rule 62 (enumerated_type_name)]
    ASSIGN [reduce using rule 64 (array_type_name)]
    ASSIGN [reduce using rule 66 (structure_type_name)]
    END_STRUCT reduce using rule 58 (simple_type_name)
    END_STRUCT [reduce using rule 60 (subrange_type_name)]
    END_STRUCT [reduce using rule 62 (enumerated_type_name)]
    END_STRUCT [reduce using rule 64 (array_type_name)]
    END_STRUCT [reduce using rule 66 (structure_type_name)]
    ';' reduce using rule 58 (simple_type_name)
    ';' [reduce using rule 60 (subrange_type_name)]
    ';' [reduce using rule 62 (enumerated_type_name)]
    ';' [reduce using rule 64 (array_type_name)]
    ';' [reduce using rule 66 (structure_type_name)]
    $default reduce using rule 58 (simple_type_name)
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    May 2007
    Posts
    765
    Rep Power
    929
    I think I see the problem.

    Each of your data type rules (simple_type_name, subrange_type_name, etc.) can be satisfied with an IDENTIFIER. You know which one should be used by looking at the definition of the custom data type. Bison & Flex don't have that information so MY_ARRAY looks the same as MY_ENUM.

    The root problem is you're trying to solve everything at the parser/lexer stage, when the information you need isn't available yet.

    If you leave the resolution of user-defined data types until the semantic analysis you should be fine. Here's a brief pseudo-code example of how I'd tackle the problem:

    Code:
    Lexer:
    
    Treat names of data types like "INT" as "IDENTIFIER".
    
    
    Parser:
    
    data-type
        = IDENTIFIER
          { SimpleType( $1 ) }
        | ARRAY [ INTEGER .. INTEGER ] OF IDENTIFIER
          { ArrayType( $8, $3, $5 ) }
    
    
    Semantic Analyser:
    
    make-variable( symbol, data-type )
        case data-type of
            SimpleType( id ):
                if built-in-data-types contains id
                    allocate( symbol, built-in-data-type(id) )
                else if user-data-types contains id
                    make-variable( symbol, user-data-type(id) )
                else
                    error "Undefined data type"
            ArrayType( base-type, start, end ):
                if built-in-data-types contains base-type
                    allocate-array( symbol, built-in-data-type(base-type), start, end )
                else if user-data-types contains base-type
                    allocate-array( symbol, user-data-type(base-type), start, end )
                else
                    error "Undefined data type"
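
The lookup step of that pseudo-code could be sketched in C roughly as follows. This is only an illustration under assumptions: TypeKind, TypeEntry, declare_user_type and resolve_type_name are hypothetical names, not taken from the grammar posted above, and the built-in list is deliberately incomplete.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Categories a type name can resolve to (hypothetical names). */
typedef enum {
    TYPE_UNDEFINED,
    TYPE_SIMPLE,
    TYPE_SUBRANGE,
    TYPE_ENUM,
    TYPE_ARRAY,
    TYPE_STRUCT
} TypeKind;

typedef struct {
    const char *name;   /* assumed already normalised to upper case */
    TypeKind    kind;
} TypeEntry;

/* A few built-in elementary types; a real compiler would list them all. */
static const TypeEntry builtin_types[] = {
    { "BOOL", TYPE_SIMPLE },
    { "INT",  TYPE_SIMPLE },
    { "REAL", TYPE_SIMPLE },
};

/* User types, filled in while walking TYPE ... END_TYPE declarations. */
#define MAX_USER_TYPES 64
static TypeEntry user_types[MAX_USER_TYPES];
static size_t    user_type_count = 0;

/* Record a user-defined type. The name pointer is stored, not copied,
   so it must outlive the table. Returns 0 on success, -1 if full. */
int declare_user_type(const char *name, TypeKind kind)
{
    assert(name != NULL);
    if (user_type_count == MAX_USER_TYPES)
        return -1;
    user_types[user_type_count].name = name;
    user_types[user_type_count].kind = kind;
    user_type_count++;
    return 0;
}

/* Resolve a type name that the parser accepted as a plain IDENTIFIER.
   Built-in types are checked first, then user-defined ones. */
TypeKind resolve_type_name(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof builtin_types / sizeof builtin_types[0]; i++)
        if (strcmp(builtin_types[i].name, name) == 0)
            return builtin_types[i].kind;
    for (i = 0; i < user_type_count; i++)
        if (strcmp(user_types[i].name, name) == 0)
            return user_types[i].kind;
    return TYPE_UNDEFINED;  /* caller reports "Undefined data type" */
}
```

After declare_user_type("MY_ARRAY", TYPE_ARRAY), a call to resolve_type_name("MY_ARRAY") yields TYPE_ARRAY, while an unknown name yields TYPE_UNDEFINED. The grammar then needs only one IDENTIFIER rule for type names, and the category is decided after parsing.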
  10. #6
  11. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    9
    Rep Power
    0

    Thanks, I am studying your reply


    For some reason I didn't get an e-mail saying anybody had replied. I just looked and saw your new post.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    9
    Rep Power
    0

    This isn't covered in the books.


    Originally Posted by OmegaZero
    The root problem is you're trying to solve everything at the parser/lexer stage, when the information you need isn't available yet.
    I am finding out this is true on a few other problems. I am also going to delay putting IDENTIFIERs in the symbol table, as I really don't know whether they are global or part of a structure until later.
  14. #8
  15. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2012
    Posts
    9
    Rep Power
    0

    Update


    I managed to solve my main two problems with shift/reduce and reduce/reduce errors and the even worse ambiguous grammar errors.

    The shift/reduce and reduce/reduce warnings still exist, but I enabled the GLR parser. This lets Bison explore many different paths. Since Structured Text has Simple, Sub Range, Enumerated, Array and Structured data types, the GLR parser will create up to 5 different parse stacks and carry them along until it can figure out which one to keep.

    The ambiguity errors are caused when the parser can get to the same state by two different paths. This problem was solved by using the %dprec (dynamic precedence) feature.
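
For anyone who has not used %dprec, here is a minimal sketch on a toy grammar. It is not the grammar from this thread, and the priority numbers are arbitrary assumptions. Both alternatives match a lone IDENTIFIER, so the GLR parser splits, both stacks reduce to type_name, and when they merge the production with the higher %dprec wins.

```yacc
%glr-parser
%token IDENTIFIER

%%

type_name:
    simple_type_name   %dprec 2   /* preferred when the parses merge */
  | array_type_name    %dprec 1
  ;

simple_type_name: IDENTIFIER ;
array_type_name:  IDENTIFIER ;
```

Note that %dprec picks by fixed priority only. It does not look at what the name was actually declared as, so the real category still has to be checked in an action or a later semantic pass.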

    My parser works now. The actions mostly work. The code generator is generating pseudo assembly. I can generate code for arrays of structures of arrays of structures. The error reporting is making use of yylloc to pin point errors.
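
The yylloc-based error reporting mentioned here can look roughly like the sketch below. It assumes Bison's default location type, whose fields are first_line, first_column, last_line and last_column (available when %locations is enabled). SrcLoc and format_error are hypothetical names; SrcLoc is only a stand-in for YYLTYPE so that the snippet compiles on its own.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in with the same four fields as Bison's default YYLTYPE.
   In a real parser, use the YYLTYPE that Bison generates. */
typedef struct {
    int first_line;
    int first_column;
    int last_line;
    int last_column;
} SrcLoc;

/* Format "line.col-col: msg" for a single-line span, or
   "line.col-line.col: msg" for a multi-line span, into buf.
   Returns whatever snprintf returns. */
int format_error(char *buf, size_t size, const SrcLoc *loc, const char *msg)
{
    assert(buf != NULL && loc != NULL && msg != NULL);
    if (loc->first_line == loc->last_line)
        return snprintf(buf, size, "%d.%d-%d: %s",
                        loc->first_line, loc->first_column,
                        loc->last_column, msg);
    return snprintf(buf, size, "%d.%d-%d.%d: %s",
                    loc->first_line, loc->first_column,
                    loc->last_line, loc->last_column, msg);
}
```

In a Bison action the location of the n-th symbol is available as @n, so a pointer to it could be passed where this sketch takes a SrcLoc pointer.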

    I knew squat when I started a few months ago. The text books aren't much help as they either state the obvious or cover what Bison and Flex do. Neither are college classes that don't get past writing a calculator. If I ever write another compiler it will be MUCH easier.
