#1
  1. The bad and the ugly...
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2007
    Location
    Oz... No??? Neverland then?
    Posts
    142
    Rep Power
    0

    State machine parser


    Hello all,

    For my final homework assignment, our instructor has us writing a simple byte parser. The attached file (which needs to be renamed *.bin) is the input. For some reason, my parser seems to be grabbing something at the end of the file and passing it to my parse() method, thus outputting junk at the end.

    The input has a standard format of
    Code:
    Start->Type->Size(low)->Size(high)->Payload->Checksum
    There are four different types for the input file (ASCII, integer, float and double), but I'm just working on ASCII right now.
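
    To make the layout concrete, here is a minimal sketch of packing one such packet. It assumes, going by the parser code in this post rather than by any spec, that the start byte is 170 (0xAA), that the size is stored low byte first, and that the checksum is the low 8 bits of the sum of every preceding byte. `buildFrame` is a hypothetical helper name, not part of the assignment.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: packs Start, Type, Size(low), Size(high), Payload, Checksum.
// Assumption: checksum = sum of all preceding bytes, truncated to 8 bits.
std::vector<std::uint8_t> buildFrame(std::uint8_t type, const std::string& payload)
{
    const std::size_t size = payload.size();  // caller must keep this <= 0xFFFF

    std::vector<std::uint8_t> frame;
    frame.push_back(0xAA);                                           // start byte (170)
    frame.push_back(type);                                           // 0 = ASCII
    frame.push_back(static_cast<std::uint8_t>(size & 0xFF));         // size, low byte
    frame.push_back(static_cast<std::uint8_t>((size >> 8) & 0xFF));  // size, high byte
    for (char c : payload)
        frame.push_back(static_cast<std::uint8_t>(c));

    std::uint8_t sum = 0;
    for (std::uint8_t b : frame)
        sum = static_cast<std::uint8_t>(sum + b);                    // wraps modulo 256
    frame.push_back(sum);
    return frame;
}
```

    Under those assumptions, `buildFrame(0, "Hello world!")` gives 17 bytes ending in 0x13, which agrees with the checksum byte in the byte dump posted further down the thread.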

    My parser class:
    Code:
    #include "parser.h"
    
    parser::parser()
    {
    	this->State = start;
    	i = 0;
    	j = 0;
    }
    
    parser::~parser()
    {
    
    }
    
    bool parser::parse(unsigned char byte)
    {
    	switch(this->State)
    		{
    			case start:
    				if (byte = 170)
    					{
    						Chk = 170;
    						State = type;
    						break;
    					}
    				
    				
    			case type:
    				Type = byte;
    				Chk += byte;
    				State = size_low;
    				break;
    					
    			case size_low:
    				Size = byte;
    				Chk += byte;
    				State = size_high;
    				break;	
    				
    			case size_high:
    				Size += (byte << 8);
    				Chk += byte;
    				State = payload;
    				break;
    				
    			case payload:
    				if (i == Size)
    					{
    						State = checksum;
    						break;
    					}
    					
    				Data[i] = byte;
    				Chk += byte;
    				i++;
    				break;
    				
    			case checksum:
    				if (byte = Chk) 
    					{
    						State = start;
    						return true;
    					}
    				else 
    					{
    						State = start;
    						break;
    					}
    		}
    	return false;
    }
    
    
    void parser::process(void)
    {
    	switch(Type)
    		{
    			case 0:
    				char *pa;
    				pa = (char*)(Data);
    					for(j; j < Size; j++)
    						{
    							std::cout << *pa << std::endl;
    							*pa++;
    						}
    					
    			case 1:
    				int *pb;
    				pb = reinterpret_cast<int *>(Data);
    				for(j; j < Size; j++)
    						{
    							std::cout << *pb;
    							*pb++;
    						}
    						
    			case 2:
    				float *pc;
    				pc = reinterpret_cast<float *>(Data);
    				std::cout << *pc;
    
    			case 3:
    				double *pd;
    				pd = reinterpret_cast<double *>(Data);
    				for (j; j < Size; j++)
    					{
    						std::cout << *pd;
    						*pd++;
    					}	
    		}
    }
    My header file:
    Code:
    #include <iostream>
    #include <string>
    
    // Definition for Magic numbers
    #define MAX_SIZE 1024
    
    // ParseState are the different states in your parser 
    enum ParseState{start, type, size_low, size_high, payload, checksum};
    
    // The different types of payload
    enum TypeID{asciiType, integerType, floatType, doubleType};
    
    // Parser class definition
    class parser{
    
    	private:
    		
    		ParseState State;				// Holds the current state of the parser
    		int i;
    		int j;
    		unsigned int Size; 				// Holds the size of the payload
    		unsigned char Type;				// Holds the type of the payload
    		unsigned char Chk;				// Holds the running checksum
    		unsigned char Data[MAX_SIZE];	// Holds the payload data
    		
    		
    	public:
    		// Constructor
    		parser();
    		
    		// Destructor
    		~parser();
    		
    		// Parse function, 
    		// Argument is a byte out of the input file
    		// Return value is true when packet has been completely parsed and checksum is correct
    		bool parse(unsigned char);
    		
    		// Process payload function
    		void process(void);
    		
    };
    And my main:
    Code:
    #include <fstream>
    #include "parser.h"
    
    using namespace std;
    
    
    int main (int argc, char *argv[])
    {
    	ifstream filename;
    	parser funk;
    	unsigned char byte;
    	
    	//if user calls program from cmd prompt incorrectly
    		if((argc < 2)) 
    			{
    				cout << "This program will parse a .bin file and print payload\n";
    				cout << "\n";
    				cout << "Usage: hw8 [input filename].bin\n";
    				exit(0);
    			}
    	
    	filename.open(argv[1], ios::binary); //open file in binary mode
    		if ( !filename )
    			{
    				cout << "File does not exist.\n";
    				exit(0);
    			}
    	
    	while (!filename.eof())
    		{
    			byte = filename.get();
    			//cout << "byte :" << hex << (int)byte << endl;
    			if (funk.parse(byte) == true)
    				funk.process();
    		}
    	filename.close();
    }


    When you run it at the command line with
    Code:
    hw8 testASCII.bin
    The output is
    Code:
    Hello World!1.14314e+027
    The only thing I can think of is that the while loop in my main is passing something to parse() that it shouldn't, but I haven't been able to figure out why. Am I not using reinterpret_cast correctly?
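
    On the reinterpret_cast question: the casts are probably not what prints the extra number. A likely culprit is that none of the `case` labels in process() ends with `break`, so after the ASCII branch runs, control falls through into the float branch. Below is a minimal sketch with the breaks added. It is written as a free function that returns text instead of printing, so it can be tried outside the class. `formatPayload` and `appendValues` are hypothetical names, and the sketch assumes Size counts bytes rather than elements.

```cpp
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>

// Copies consecutive T values out of the byte buffer and appends them as text.
// memcpy avoids the alignment problems a pointer cast on a byte array can hit.
template <typename T>
void appendValues(std::ostringstream& out, const unsigned char* data, std::size_t size)
{
    for (std::size_t k = 0; k + sizeof(T) <= size; k += sizeof(T))
    {
        T value;
        std::memcpy(&value, data + k, sizeof value);
        out << value << ' ';
    }
}

// Hypothetical stand-in for process(): one branch per payload type.
std::string formatPayload(unsigned char type, const unsigned char* data, std::size_t size)
{
    std::ostringstream out;
    switch (type)
    {
        case 0: // ASCII
            out.write(reinterpret_cast<const char*>(data),
                      static_cast<std::streamsize>(size));
            break; // without a break, control runs on into cases 1, 2 and 3
        case 1: appendValues<int>(out, data, size);    break;
        case 2: appendValues<float>(out, data, size);  break;
        case 3: appendValues<double>(out, data, size); break;
        default: break; // unknown type: print nothing
    }
    return out.str();
}
```

    If that reading is right, the trailing 1.14314e+027 is just the first four payload bytes, 48 65 6c 6c ("Hell"), printed as a little-endian float by the fallen-through float case.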
    "Life is not a journey with the intent on arriving at the finish line in a pretty and well preserved body. But rather to skid in broadside, totally worn out, thoroughly used up and loudly proclaiming, "Wow! What a ride!" -Anonymous
    Halo! || Diablo 2 LOD Modding || OLGA's BACK!
  2. #2
  3. The bad and the ugly...
    headway!

    Okay, so if I modify my while loop in main to print out byte each time through:
    Code:
    while (!filename.eof())
    		{
    			byte = filename.get();
    			
    			cout << "byte :" << hex << (int)byte << endl;
    			if (funk.parse(byte) == true)
    				funk.process();
    		}
    The output is
    Code:
     
    byte :aa
    byte :0
    byte :c
    byte :0
    byte :48
    byte :65
    byte :6c
    byte :6c
    byte :6f
    byte :20
    byte :77
    byte :6f
    byte :72
    byte :6c
    byte :64
    byte :21
    byte :13
    byte :ff
    c
    Hello world!1.14314e+027
  4. #3
  5. The bad and the ugly...
    figured it out!
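
    The fix itself was never posted. For anyone finding this later, here is a minimal sketch of what a corrected state machine could look like, judging only by the code and byte dump in the earlier posts. The changes are guesses at what went wrong: `==` instead of `=` in the two comparisons, no fall-through out of the start state, moving to the checksum state as soon as the last payload byte is stored (so the checksum byte itself is the one compared), and clearing the payload for every new packet. `FrameParser` and its members are hypothetical names, not the original class.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class FrameParser
{
public:
    // Feed one byte; returns true when a whole packet with a valid checksum has arrived.
    bool parse(std::uint8_t byte)
    {
        switch (state_)
        {
            case State::Start:
                if (byte == 0xAA)        // comparison, not assignment
                {
                    chk_ = byte;
                    data_.clear();       // start every packet with an empty payload
                    state_ = State::Type;
                }
                break;                   // anything else is skipped while hunting for a start byte
            case State::Type:
                type_ = byte;
                chk_ += byte;
                state_ = State::SizeLow;
                break;
            case State::SizeLow:
                size_ = byte;
                chk_ += byte;
                state_ = State::SizeHigh;
                break;
            case State::SizeHigh:
                size_ |= static_cast<std::size_t>(byte) << 8;
                chk_ += byte;
                state_ = (size_ == 0) ? State::Checksum : State::Payload;
                break;
            case State::Payload:
                data_.push_back(byte);
                chk_ += byte;
                if (data_.size() == size_)   // last payload byte stored
                    state_ = State::Checksum;
                break;
            case State::Checksum:
                state_ = State::Start;
                return byte == chk_;         // this byte is the checksum itself
        }
        return false;
    }

    std::uint8_t type() const { return type_; }
    const std::vector<std::uint8_t>& payload() const { return data_; }

private:
    enum class State { Start, Type, SizeLow, SizeHigh, Payload, Checksum };
    State state_ = State::Start;
    std::uint8_t type_ = 0;
    std::uint8_t chk_ = 0;               // running checksum, wraps modulo 256
    std::size_t size_ = 0;
    std::vector<std::uint8_t> data_;
};
```

    Fed the 17 real bytes from the dump (aa through 13, without the trailing ff), this sketch reports exactly one complete packet with a 12-byte payload.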
  6. #4
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2012
    Posts
    3
    Rep Power
    0

    State machine parser


    This is in contrast to the parsing-theory origins of the term finite-state machine where the machine is described as consuming characters or tokens.
