November 4th, 2006, 04:29 PM
I am having a problem with lex. In the main section, where the regular expressions and their code are defined, tokens are not being recognized properly.
First in the list of regular expressions are a few tokens, e.g.:
SURFACE return SURFACETOKEN;
DESCRIPTION return DESCRIPTIONTOKEN;
Then there is a normal text expression:
([\(\)_''/,&a-zA-Z0-9.-]+[ ]*)+ yylval=strdup(yytext); return TEXT;
My problem is that "SURFACE" is being read as "TEXT", which messes my parser up. I know that SURFACE would qualify for the TEXT regular expression, but I had thought that lex would choose the one that came first in the list. Apparently not. Strangely, sometimes the tokens are read properly, but for the majority of the input file I am using, they are not.
How can I deal with this? Is there any way to tell lex that one expression should be judged before another? Or is there an efficient way to tell lex that the tokens do not qualify for the normal text expression? I know I could do this in yacc, but I'd prefer to do it in lex, and it seems like it should be possible.
I'd appreciate any advice.
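For anyone reading along: the behaviour described above is consistent with how lex resolves ambiguity. It takes the longest match first, and rule order only breaks ties between matches of the same length. Because the TEXT pattern can absorb trailing spaces and further words, it matches more characters than SURFACE whenever the keyword is followed by a space, and only ties (and loses) when the keyword is followed by something like a newline. Below is a minimal sketch of one possible workaround, not a definitive fix: TEXT is limited to a single word so that it can only tie with a keyword. The token codes and the main() driver are hypothetical stand-ins for what yacc would normally provide.

```lex
%{
/* Sketch only: these token codes are hypothetical stand-ins for the
   values yacc would normally generate (e.g. in y.tab.h). */
#include <stdio.h>
enum { SURFACETOKEN = 258, DESCRIPTIONTOKEN, TEXT };
%}
%%
SURFACE                     { return SURFACETOKEN; }
DESCRIPTION                 { return DESCRIPTIONTOKEN; }
[\(\)_'/,&a-zA-Z0-9.-]+     { return TEXT; /* one word, no trailing spaces */ }
[ \t\n]+                    { /* skip whitespace between words */ }
%%
/* Tell the scanner there is no further input after end of file. */
int yywrap(void) { return 1; }

/* Hypothetical driver: print each token code and its text. */
int main(void)
{
    int tok;
    while ((tok = yylex()) != 0)
        printf("%d\t%s\n", tok, yytext);
    return 0;
}
```

With these rules, an input word SURFACE matches both the keyword rule and the TEXT rule at the same length, so the earlier rule should win and SURFACETOKEN is returned. The trade-off is that TEXT no longer spans several words, so multi-word text arrives as a sequence of TEXT tokens, and a longer word such as SURFACES still comes back as TEXT.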