#1
  1. Registered User
    Devshed Newbie (0 - 499 posts)
    Join Date: Jan 2013 | Posts: 3 | Rep Power: 0

    Flex/Bison details and error handling.


    Hi,
    I have inherited an assembler written in Flex/Bison. It all compiles and works, and I can make simple changes to the build.

    However, the error handling is less than ideal, and I have found some truly strange behaviour in the flex-coded conditional handling.

    I can find the source lines for IFDEF/ELSE/ENDIF, and nothing leaps out, but I am no .l expert.

    Take this example of what lexer.l accepts (i.e. standard asm):

    ASM Code:
    ;TestTrue EQU 1 
    LLif     EQU  30H   ; FAILS in some contexts only 
    LLi_f     EQU  30H   ; OK                 
         MOV     R1,LLif   ; fine outside IFDEFs  
    IFDEF TestTrue  ; something fishy here - parser triggers on EMBEDDED if - inside another word  ?!                 
         MOV     R1,LLi_f  ; this one fails if spelt LLif   
    ELSE                 
        MOV     R1,LLif   ; this one is OK 
    ENDIF


    The trigger conditions are that the word if (which is a valid operator) must be inside a conditional block, and specifically inside the non-active branch. Only in that 'zone' does the lexer change its behaviour: it treats any if at all, even one contained inside a longer word, as opening another conditional level.

    Remove the if spelling and it all works 100%; add back LLif (or Life, or Fifo) and it fails, rather ungracefully.

    I've tried different versions of Flex/Bison, and they seem to make no difference.

    So the issue must be something in lexer.l, but what could make it do something as basic as switching to substring matching, and only in some areas?

    I can see blocks called
    <COND_SKIP> and <COND_SCAN> which seem to manage the conditionals, but nothing that says 'when inside a non-active conditional block, change keyword rules from whole word to substring'. Besides, this behaviour is not something anyone would code deliberately; it has to be an accident.

    Does anyone know how .l code specifies whole-word versus substring matches?

    Or does anyone know of an example assembler, or just an IFDEF-style conditional processor, in lexer.l format?
    If I can find simple example code that does not do this, I can try to spot the difference.

    I know it triggers the conditional parser because, if I write this {nonsense}, it assembles with no errors?!
    ASM Code:
     
    ;TestTrue EQU 1
    LLif     EQU  30H   ; FAILS in some contexts only
    LLi_f     EQU  30H   ; OK
                    MOV     R1,LLif   ; fine outside IFDEFs
     
      IFDEF TestTrue  ; something fishy here - parser triggers on EMBEDDED if ?!
                    MOV     R1,LLif  ; this one fails if spelt LLif
                                 ENDIF ; nonsense line matches phantom if -> works 100%
      ELSE
                    MOV     R1,LLif   ; this one is OK
      ENDIF
  2. #2
  3. Contributing User
    Devshed Demi-God (4500 - 4999 posts)
    Join Date: Aug 2011 | Posts: 4,996 | Rep Power: 481
    I suggest you post the flex and bison files. From there we can ask further relevant questions.
    [code]Code tags[/code] are essential for python code and Makefiles!
  4. #3
  5. Registered User
    Originally Posted by b49P23TIvg
    I suggest you post the flex and bison files. From there we can ask further relevant questions.
    Thanks, I think I have narrowed it down to these chunks below.

    If I have this right then this line

    <ASM>IFNB { BEGIN BLANK; return IFNB; }

    applies the rules in <ASM> until it finds "IFNB" and then returns. But is that a return via a call to a label IFNB, or just the text "IFNB"?
    I can only find IFNB within a block called
    <COND_SCAN>, where it seems to be a 'text match', not a label
    (see below).
    Or are text matches only the ones in quotes, and anything else is a public label?


    My guess is that the first rule branch is OK, but the rules applied while looking for the matching COND are too lax and do not require whole words.

    But I would also have expected the basic 'whole words' handling to be common across most tests.


    FLEX Code:
     
    <COND_SCAN>{
    [ \t\r]+		; /* ignore  whitespace */
    \;.*			;   /* ignore comments */
    "//".*			; /* ignore comments */
    #IF		{ condp = push_cond(condp, S_IGNORE, T_CPRE); }
    #IFDEF		{ condp = push_cond(condp, S_IGNORE, T_CPRE); }
    #IFNDEF		{ condp = push_cond(condp, S_IGNORE, T_CPRE); }
    \.IF		{ condp = push_cond(condp, S_IGNORE, T_DOT); }
    IF		{ condp = push_cond(condp, S_IGNORE, T_MCS); }
    IFB		{ condp = push_cond(condp, S_IGNORE, T_MCS); }
    IFN		{ condp = push_cond(condp, S_IGNORE, T_MCS); }
    IFNB		{ condp = push_cond(condp, S_IGNORE, T_MCS); }
    IFDEF		{ condp = push_cond(condp, S_IGNORE, T_MCS); }
    IFNDEF		{ condp = push_cond(condp, S_IGNORE, T_MCS); }
    #ELIF		{ if ((condp == NULL) || (condp->type != T_CPRE)) {
    		    do_err("No matching #IF", yylloc, 43, 1);
    		  }
    		  else if (condp->state == S_FALSE) {
    		    yylineno--; yyless(0);
    		    BEGIN ASM;
    		  }
    		}
    /* etc */
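
    The 'too lax' guess can be modelled outside flex. Below is a minimal C sketch, not the generated scanner, of how flex chooses a match: at the current position it tries every active rule, takes the longest match, and on a tie prefers the rule listed first. It assumes two things the posted fragment does not confirm: that the lexer is built case-insensitive (otherwise the lowercase if in LLif could not match IF), and that the elided part of <COND_SCAN> has no rule that consumes a whole identifier, only a one-character fallback. The names count_keyword_hits, match_keyword and match_ident are made up for this illustration.

    ```c
    #include <ctype.h>
    #include <stddef.h>
    #include <string.h>

    /* Keywords from the <COND_SCAN> block that push a nested conditional. */
    static const char *const KEYWORDS[] = {
        "IF", "IFB", "IFN", "IFNB", "IFDEF", "IFNDEF"
    };

    /* Length of the longest keyword matching at s (case-insensitive), else 0. */
    static size_t match_keyword(const char *s)
    {
        size_t best = 0;
        for (size_t i = 0; i < sizeof KEYWORDS / sizeof KEYWORDS[0]; i++) {
            size_t n = strlen(KEYWORDS[i]);
            size_t j = 0;
            while (j < n && toupper((unsigned char)s[j]) == KEYWORDS[i][j])
                j++;
            if (j == n && n > best)
                best = n;
        }
        return best;
    }

    /* Length of an identifier [A-Za-z_][A-Za-z0-9_]* starting at s, else 0. */
    static size_t match_ident(const char *s)
    {
        size_t n = 0;
        if (!(isalpha((unsigned char)s[0]) || s[0] == '_'))
            return 0;
        while (isalnum((unsigned char)s[n]) || s[n] == '_')
            n++;
        return n;
    }

    /*
     * Count how often a keyword rule would fire on one source line.
     * A non-zero with_ident_rule adds a hypothetical identifier rule
     * listed AFTER the keyword rules.
     */
    int count_keyword_hits(const char *line, int with_ident_rule)
    {
        int hits = 0;
        const char *p = line;
        while (*p != '\0') {
            if (*p == ';')              /* comment rule: ignore the rest of the line */
                break;
            size_t kw = match_keyword(p);
            size_t id = with_ident_rule ? match_ident(p) : 0;
            if (kw > 0 && kw >= id) {   /* longest match; a tie goes to the earlier rule */
                hits++;
                p += kw;
            } else if (id > 0) {
                p += id;                /* whole identifier consumed, no keyword fired */
            } else {
                p++;                    /* fallback: skip one unmatched character */
            }
        }
        return hits;
    }
    ```

    Under those assumptions, count_keyword_hits("MOV R1,LLif", 0) reports one hit and count_keyword_hits("MOV R1,LLi_f", 0) reports none, which matches the symptom. With the identifier rule enabled, LLif is consumed whole, while IFDEF still wins its tie because the keyword rules come first. In lexer.l the equivalent would presumably be one extra rule at the end of the <COND_SCAN> block, something like `[A-Za-z_][A-Za-z0-9_]*  ;`, but that is untested against the real source.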


    and also these 'starter' paths

    Flex Code:
     
    /* plenty of these */
    <ASM>IFN	{ return IFN; }
    <ASM>IFNB	{ BEGIN BLANK; return IFNB; }
    <ASM>IFDEF	{ return IFDEF; }
     
    /* Which I think apply this ? */
     
    <ASM>{
    \'[^'\n]*\'	{ if (yytext[yyleng-1] != '\'') {
    		    do_err("String not terminated", yylloc, 33, 1);
    		  } else if (yytext[yyleng-2] == '\\') {
    		    yyless(yyleng-1);
    		    yymore();
    		  } else {
    		    if (yyleng == 3) {
    		      yylval.value = yytext[1];
    		      return NUMBER;
    		    } else if (yyleng == 4) {
    		      yylval.value = (yytext[1] <<8) | yytext[2];
    		      return NUMBER;
    		    } else {
    		      yylval.str = do_ascii(yytext, '\''); //strdup(yytext);
    		      return STRING;
    		    }
    		  }
    		}
  6. #4
  7. Contributing User
    Gosh I see many return statements in the flex source. To where might this function return?
  8. #5
  9. Registered User
    Originally Posted by b49P23TIvg
    Gosh I see many return statements in the flex source. To where might this function return?
    If I understand Flex right, they return to a point after the
    BEGIN ASM;
    However, the coarse flow of the software is fine. I am looking for something subtle: how does a skipping block flip from whole-word tests to within-word matches?
  10. #6
  11. Contributing User
    The return statements return tokens to the bison parser. For complex error handling you need to understand the interaction between the parser and lexical analyzer.
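
    To make the point about return concrete: in a flex action, a name such as IFNB is neither a label nor a piece of text. It is an integer token code, normally defined in the header that bison generates from the grammar's %token declarations, and `return IFNB;` hands that integer back to whoever called yylex(), usually yyparse(). The C sketch below imitates that hand-off without flex or bison. Everything prefixed toy_ is hypothetical, the token values are illustrative, and the word-by-word input is a simplification.

    ```c
    #include <stddef.h>
    #include <string.h>

    /* Token codes in the style of a bison-generated header; values are illustrative. */
    enum { TOK_EOF = 0, IFN = 258, IFNB = 259, IFDEF = 260, TOK_OTHER = 261 };

    /* Start conditions, standing in for flex's <ASM> and <BLANK>. */
    enum start_cond { SC_ASM, SC_BLANK };

    static enum start_cond current_sc = SC_ASM;
    static const char *const *next_word = NULL; /* cursor into a NULL-terminated word list */

    /* Point the toy scanner at a new word list and reset its start condition. */
    void toy_set_input(const char *const *words)
    {
        next_word = words;
        current_sc = SC_ASM;
    }

    enum start_cond toy_start_condition(void)
    {
        return current_sc;
    }

    /* Stand-in for yylex(): each call scans one word and returns its token code. */
    int toy_yylex(void)
    {
        if (next_word == NULL || *next_word == NULL)
            return TOK_EOF;
        const char *w = *next_word++;
        if (strcmp(w, "IFN") == 0)
            return IFN;
        if (strcmp(w, "IFNB") == 0) {
            current_sc = SC_BLANK;  /* like BEGIN BLANK: records the state for later calls */
            return IFNB;            /* like return IFNB: an integer goes back to the caller */
        }
        if (strcmp(w, "IFDEF") == 0)
            return IFDEF;
        return TOK_OTHER;
    }

    /* Stand-in for the parser side: keep asking for tokens until end of input. */
    int toy_count_tokens(void)
    {
        int n = 0;
        while (toy_yylex() != TOK_EOF)
            n++;
        return n;
    }
    ```

    Feeding it the words IFNB, MOV, IFDEF yields the codes IFNB, TOK_OTHER, IFDEF and then TOK_EOF, and the start condition reads SC_BLANK after the first call. The real scanner works on characters rather than words, but the division of labour is the same: BEGIN changes which rules are active next time, and return delivers one token to the parser.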
