#1
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Posts
    6
    Rep Power
    0

    Saving stdout into dictionaries


    Hi All,

    I would like to run a shell command and, if the output follows a certain pattern, save the results into individual sets.

    Basically it is like this:

    Run shell command (subprocess.Popen(cmd))
    Save output to variable (out, err = p.communicate())
    Process saved data into separate variables/dictionaries/lists (not sure which is best?)
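    A minimal sketch of the first two steps, assuming Python 3.7 or later. The command below is only a placeholder (it has the Python interpreter print two lines); the real `cmd` goes in its place. Passing `text=True` makes `communicate()` return `str` rather than `bytes`, which keeps later string handling simple.

```python
import subprocess
import sys

# Placeholder command (an assumption for illustration only):
# substitute the real shell command here.
cmd = [sys.executable, "-c", "print('Name = Andy'); print('Age = 28')"]

# Capture stdout and stderr as text instead of bytes.
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
out, err = p.communicate()  # blocks until the command has finished

if p.returncode != 0:
    print("command failed:", err)
```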

    Note: Expected output can be like this (there can be several batches but with different data)

    Code:
    process results from shell... 
    Name = Andy
    Gender = Male
    UniqueID = [11:22:33:44]
    Age = 28
    
    Name = Debbie
    Gender = Female
    UniqueID = [22:33:44:55]
    Age = 29
    Driver = Yes
    
    Name = Steven
    Gender = Male
    UniqueID = [33:44:55:66]
    Age = 27
    
    Done
    I would like to be able to get:
    The number of data sets (using the above example, this should be 3)
    Each data set stored and accessible, so that printing set.3 shows:

    Code:
    Name = Steven
    Gender = Male
    UniqueID = [33:44:55:66]
    Age = 27
    Parts of each set, so that printing set.3.UniqueID shows:

    Code:
    33:44:55:66

    I do not want to store blank lines, leading or trailing lines, or other 'unwanted' data such as:
    Code:
    process results from shell... 
    Done
    Driver = (This unwanted entry appears in second data set)
    -------------------------------------------
    Where am I now:
    I can save the output to a variable but do not know how to split it up. I was looking at regex, grep, etc. but don't know how to make this robust; for example, 10 data sets could be returned and each should be saved accordingly.
  2. #2
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Feb 2004
    Location
    San Francisco Bay
    Posts
    1,939
    Rep Power
    1313
    You can begin with
    Code:
    lines = out.split('\n')
    This gives you the lines of output as a list, which is the first step to parsing this type of output. You probably want to store all the records in a list:
    Code:
    records = []
    Since each record spans a variable number of lines, as you iterate over the lines, you will typically need to insert data into a record that was created previously, so you can define a variable for this record. You expressed some uncertainty about the data structure to use for the individual records. I'd recommend using simple dictionaries, since they are very easy to use and flexible:
    Code:
    current_record = {}
    The idea is now to iterate through the lines and keep inserting into the current record until we hit a blank line, in which case we know the current record is complete and we can append it to the list of records.
    Code:
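    for line in lines:
        if not line.strip():
            # A blank line ends the current record.
            if current_record:
                records.append(current_record)
                current_record = {}
            continue
        (key, value) = line.split('=')
        current_record[key.strip()] = value.strip()
    # Save the last record if the output did not end with a blank line.
    if current_record:
        records.append(current_record)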
    Note the liberal use of str.strip() to get rid of leading and trailing spaces. This should get you started, but it isn't perfect. For example, the line
    Code:
        (key, value) = line.split('=')
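    will raise a ValueError if the line does not contain exactly one '='. In your sample output, the "process results from shell..." and "Done" lines would both trigger this. You should decide what you want to do in that case: perhaps skip the line, skip the record, or abort the program.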
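    Putting the pieces together, here is one possible sketch that takes the "skip the line" option. The function name `parse_records` and its `ignore_keys` parameter are made up for illustration. It assumes that lines without an '=' (such as the banner and "Done") can be silently dropped, that the whole `Driver` entry is unwanted, and that the square brackets around a value should be removed.

```python
def parse_records(out, ignore_keys=("Driver",)):
    """Return a list of dicts, one per blank-line-separated block of 'key = value' lines."""
    records = []
    current = {}
    for line in out.splitlines():
        if not line.strip():
            # Blank line: the current record (if any) is complete.
            if current:
                records.append(current)
                current = {}
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # no '=' on this line (e.g. "Done"): skip it
        key, value = key.strip(), value.strip()
        if not key or not value or key in ignore_keys:
            continue  # drop empty or unwanted entries
        if value.startswith("[") and value.endswith("]"):
            value = value[1:-1]  # "[33:44:55:66]" -> "33:44:55:66"
        current[key] = value
    if current:
        # Output ended without a trailing blank line.
        records.append(current)
    return records


sample = """process results from shell...
Name = Andy
Gender = Male
UniqueID = [11:22:33:44]
Age = 28

Name = Debbie
Gender = Female
UniqueID = [22:33:44:55]
Age = 29
Driver = Yes

Name = Steven
Gender = Male
UniqueID = [33:44:55:66]
Age = 27

Done
"""

records = parse_records(sample)
print(len(records))            # 3
print(records[2]["UniqueID"])  # 33:44:55:66
```

    Lists are indexed from zero, so the third data set is `records[2]`. All values are stored as strings; convert with `int(records[2]["Age"])` if you need numbers.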
  4. #3
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Posts
    6
    Rep Power
    0
    Many thanks, you've given me some good pointers of what to look at and experiment with.
