Thread: Parser tutorial

    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    26
    Rep Power
    0

    Parser tutorial


    i have an interest to program parsers. I want to program my own expression parser.but i dont find a good online resource to begin with parser tutorial in c++.they directly show the source code and i have to just copy and pase the code in my vc++and compile and run it. But i cant understand the standard algorithm for parsing.i also cant feel the joy of programming when i do that.so i please you to suggest a best tutorial to your point of view.i like to begin with expression parsing. What i am comming to say is i dond need the code for expression parser but i need an idea about parsing.i already know c and c++
  2. #2
  3. Transforming Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    14,143
    Rep Power
    9398
    Grab a book on it. Compilers are a complicated subject so if you find anything that advertises itself as a tutorial on parsing then they're not telling you the whole story.

    For example I have this one.
  4. #3
  5. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,184
    Rep Power
    2222
    I agree with requinix. This is not a trivial topic. Books on writing compilers would be a good source, though many of them will get more into the theory of different kinds of parsers and not offer enough practical examples. A book I recently worked through was different. Writing Compilers and Interpreters by Ronald Mak went through the design and development of a Pascal compiler written in C++. Its weakness was that it just chose one parsing method, top-down (recursive descent), which is not considered as good as others. Also, although it's a thick book, most of it is code listings with mostly superficial discussion of the code (the files are available for download at ftp://ftp.wiley.com/public/computer_...ting_Compilers -- hopefully they're still there; I had downloaded mine several years ago). I wouldn't particularly recommend it, but it would give you some practice to start out with. And since most of the book is code listings anyway, downloading the code might help; since the code was meant for teaching, it should be commented better than what you've already downloaded. Though I had to change a couple things to get it to compile under Visual Studio 2008.

    Also, there are software tools for creating parsers, such as flex and bison or lex and yacc (different versions of the same thing). I'm not very familiar with them, but others here are. The thing is that you would need to create a set of syntax rules. If you have access to Kernighan and Ritchie's The C Programming Language, then the rules of C grammar are given in Appendix A13. Or you could find them on-line. You would need to be familiar with Backus-Naur Form (BNF).


    In Mak's book, he started with the scanner that would read the source line by line, character by character, and tokenize the source. In terms of your expression parser, that means that you would create special enums or macros for the operators (I would prefer enums), so that if an operator or grouping symbol (ie, parenthesis) is found, then you would replace it with its corresponding token enum. The scanner/tokenizer would act as a filter which reads the source and outputs a tokenized version of the source, in which the literal constants would have passed through and the operators, grouping symbols, and reserved words would have been replaced by their tokens. You would also place variable names and literal constants into a symbol table or similar structure, replacing them with a token that would refer to the proper entry in the table -- this could either be done during the tokenizing or in a subsequent step; I'd opt for doing it in the tokenizer. Also please note that all white space would have been stripped out of the output. Also, assuming that you had built a symbol table, this output could be used by an interpreter to "run" this code.

    The parsing step would then analyze the tokenized output for correctness, for which it would use the language's grammar rules. Mak's approach involved recursive function calls, whereas other parsers would undoubtedly use data structures such as trees; that's where Mak's approach started getting clunky.

    In an expression parser, I would want to convert the expression from infix notation to postfix notation, since it's easy to evaluate postfix. For example, this expression, a * b, would be written in postfix as a b * . You would evaluate that on a stack as push(a), push(b), pop the top two values from the stack, multiply them together, and push the result onto the stack. Another example of infix to postfix conversion would be 42 + (a + b) * (c - d) , which in postfix would be 42 a b + c d - * + .

    The method that I am familiar with to convert from infix to postfix is the shunting-yard algorithm.

    Then to evaluate a postfix expression using a stack, you simply take one token at a time in order. If it's a value, then you push that onto the stack. If it's an operator, then you pop however many operands it calls for off the stack, perform the operation, and push the result onto the stack.

    When I did this in school (1979, using PL/I), the shunting algorithm took me the normal amount of work to get working, but then the expression evaluation program worked perfectly first time as did merging both programs together. Which is to say that it's not difficult.
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    26
    Rep Power
    0

    parser tutorial


    thank you for ur great explanation.i will start my voyage on parsers tomorrow morning with ur link
  8. #5
  9. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,905
    Rep Power
    481
    Read the bison and flex manuals.

    At this post I have a parser/evaluator for php. Handles a LITTLE BIT of php, including an if statement. The parser builds and evaluates a parse tree.http://forums.devshed.com/showthread.php?p=2840714#post2840714
    see attachment at post 6.

    The shunting yard algorithm is expressed in many languages at
    http://rosettacode.org/wiki/Parsing/Shunting-yard_algorithm

    I wrote the j implementation about a year and a half ago.

    And you might search rosettacode.org for parser.
    [code]Code tags[/code] are essential for python code and Makefiles!
  10. #6
  11. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,184
    Rep Power
    2222
    Are you familiar with the UNIX/Linux utility, bc (binary calculator)? For example:
    C:TEST>bc
    bc 1.06
    Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
    This is free software with ABSOLUTELY NO WARRANTY.
    For details type `warranty'.
    x=4
    y=42
    x*y
    168
    quit

    C:TEST>
    Just mentioning it as a likely example of a possible project to practice implementing expression parsing. It's always good to have a project to work on when you're learning something. Keeps it interesting, keeps you motivated, and it keeps you from taking the lazy way out all the time.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    26
    Rep Power
    0
    i dont know u/l utility but i know the same kind of thing in matlabs.they both are quite similar from my point of view.and i will try to learn more about bnf xbnf and shunt_yard algorithm.and thank you very much for spending valuable time for my small question.thank you very much.i will try to post my own expression parser or derivativecalculator in c++ within a week or a two.i actually started this project for finding the derivative.then i came to know that i need to know about parsers and all,.so it kept me interested and motivated at all the time.i will google about those algorithm
  14. #8
  15. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,905
    Rep Power
    481
    There are programs that read c functions and write a c function that computes the derivative. I cannot point you to one of these any better than you could by internet search. Learning about parsing is useful, you'll be able to write a mini-language for some application. If your interests lie in other directions then don't reinvent the wheel. Then again, you might develop a rounder one.
    [code]Code tags[/code] are essential for python code and Makefiles!

IMN logo majestic logo threadwatch logo seochat tools logo