    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2011
    Posts
    23
    Rep Power
    0

    [BASH] Issue with parsing line from file


    Hi,

    I am trying to parse a line from a file and place the values into separate variables:

    input.txt:
    Code:
    CreateVegaFeed-20110928-before-skip-start
    This is the code I have so far:

    Code:
    input_file="input.txt"
    INPUT_FILE=`cat $input_file` 
    
    for i in $INPUT_FILE
    do
        #parse current line from input file into appropriate variables 
        IFS='-';read currentJBPM time_stamp report_type skip_node start_node <<<$i
    
    #do stuff with the variables
    done
    The issue is that the values in the file are not being assigned to the variables, except for the first value: CreateVegaFeed

    This is the output of the code with set -x:

    Code:
    ++ cat input.txt
    + INPUT_FILE='CreateVegaFeed - 20110928 - before - skip - start '
    + for i in '$INPUT_FILE'
    + IFS=-
    + read currentJBPM time_stamp report_type skip_node start_node
    + echo process: CreateVegaFeed
    process: CreateVegaFeed
    + echo timestamp:
    timestamp:
    + echo report type:
    report type:
    + echo skip:
    skip:
    + echo start:
    start:
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2006
    Posts
    2,477
    Rep Power
    1752
    The set -x output suggests that there is extraneous white space in the line read from the input file, which makes me think that the for loop is presenting one word at a time.
    Is that all the output from the command? I'd expect the for loop to iterate a few more times.
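    To illustrate the word-splitting point, here is a minimal sketch. The sample line, with spaces around the hyphens, is an assumption based on the trace above. An unquoted for loop iterates over whitespace-separated words, while a while/read loop consumes whole lines:

```shell
#!/usr/bin/env bash
# Assumed sample line with spaces around the hyphens, as the trace suggests.
contents='CreateVegaFeed - 20110928 - before - skip - start'

# Unquoted expansion: the shell splits on whitespace, so the loop body
# runs once per word (including each lone "-").
for_count=0
for word in $contents; do
    for_count=$((for_count + 1))
done

# while/read: the body runs once per line, whatever spaces it contains.
line_count=0
while IFS= read -r line; do
    line_count=$((line_count + 1))
done <<< "$contents"

echo "for iterations: $for_count"
echo "while-read iterations: $line_count"
```

    With this sample the for loop runs nine times and the while loop once, which would explain why only the first word ever reaches the read.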
    The moon on the one hand, the dawn on the other:
    The moon is my sister, the dawn is my brother.
    The moon on my left and the dawn on my right.
    My brother, good morning: my sister, good night.
    -- Hilaire Belloc
  4. #3
  5. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,966
    Rep Power
    481

    Learn gawk; it will pay off big time if this sort of issue recurs for you.


    Code:
    $ gawk -F- '{for(i=1;i<=NF;++i){printf "field %d: %s\n",i,$i}}'
    a-b-cde - efg
    field 1: a
    field 2: b
    field 3: cde 
    field 4:  efg
    ^D
    The field separator is a regular expression, so you could handle the possible space problem with:

    $ gawk -F' *- *' 'gawk program that often fits on a line' input.txt
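    A small sketch of that separator in action, assuming a record with stray spaces around each hyphen (the sample record is made up, and plain awk stands in for gawk):

```shell
#!/usr/bin/env bash
# Assumed sample record with stray spaces around the hyphens.
record='CreateVegaFeed - 20110928 - before - skip - start'

# -F' *- *' treats a hyphen plus any surrounding spaces as one separator,
# so the fields come out already trimmed.
fourth=$(printf '%s\n' "$record" | awk -F' *- *' '{ print $4 }')
count=$(printf '%s\n' "$record" | awk -F' *- *' '{ print NF }')

echo "field 4: [$fourth]"
echo "fields: $count"
```

    Note that this only trims spaces next to a hyphen; a trailing space at the very end of the line would still stay in the last field.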
    Last edited by b49P23TIvg; September 28th, 2011 at 03:28 PM. Reason: bit-o-format
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2006
    Posts
    2,477
    Rep Power
    1752
    Yes but ... that doesn't actually help assign those values to variables outside the [gn]awk script, does it?
  8. #5
  9. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,966
    Rep Power
    481

    say what?


    $ shell_variable=$( gawk 'program' input files here )
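    Spelled out a little more, as a rough sketch: the sample line and the "|" separator are assumptions for illustration, and plain awk is used in place of gawk.

```shell
#!/usr/bin/env bash
# Hypothetical sample line; awk (any variant) stands in for gawk here.
line='CreateVegaFeed-20110928-before-skip-start'

# One command substitution per value: awk prints a field, the shell stores it.
time_stamp=$(printf '%s\n' "$line" | awk -F- '{ print $2 }')

# Several values from a single awk run: print them on one line separated by
# "|" (assumed not to occur in the data) and let read split on that character.
IFS='|' read -r currentJBPM report_type <<< "$(printf '%s\n' "$line" | awk -F- '{ print $1 "|" $3 }')"

echo "process: $currentJBPM"
echo "timestamp: $time_stamp"
echo "report type: $report_type"
```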
  10. #6
  11. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2011
    Posts
    23
    Rep Power
    0
    Thanks guys it was very helpful!

    I figured out that it can also be done the following way:


    Code:
    INPUT_FILE=`cat $input_file` 
    
    while read -r i
    do
        #parse current line from input file into appropriate variables 
        read currentJBPM time_stamp report_type skip_node start_node <<<$(IFS="-";echo $i)
        #echo process: $currentJBPM
        #echo timestamp: $time_stamp
        #echo report type: $report_type
        echo skip: $skip_node
        echo start: $start_node
    done <<< "$INPUT_FILE"
    The values are getting stored in the correct variables. The only issue I am having is with spaces. Say, for example, the value that is supposed to be stored in $skip_node looks like this:

    Code:
    "val1 val2"
    When I run the code, val1 gets stored in $skip_node and val2 gets stored in $start_node. Hence I am thinking the IFS is not being interpreted properly.
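    For what it's worth, in the snippet above the unquoted echo $i inside $(IFS="-"; ...) turns every hyphen into a space, and the outer read then splits on whitespace, so a field containing a space gets cut in two. A minimal sketch of one way around it, assuming a made-up line whose fourth field is "val1 val2": set IFS only for the read and quote the here-string.

```shell
#!/usr/bin/env bash
# Hypothetical line whose fourth field contains a space.
line='CreateVegaFeed-20110928-before-val1 val2-start'

# IFS=- applies only to this read; the quoted here-string passes the line
# through untouched, so spaces inside a field survive.
IFS=- read -r currentJBPM time_stamp report_type skip_node start_node <<< "$line"

echo "skip: $skip_node"
echo "start: $start_node"
```

    Putting IFS=- directly in front of read also keeps the change from leaking into the rest of the script.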
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2006
    Posts
    2,477
    Rep Power
    1752
    What, precisely, is the format (including spaces, tabs, etc.) of the input file?
  14. #8
  15. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,966
    Rep Power
    481

    good question


    It's looking like akaballa123 needs to parse strings as a single token. Stick with gawk. And you're right, I didn't read the original question carefully.
  16. #9
  17. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2011
    Posts
    23
    Rep Power
    0
    Hmm, I tried your code and assigned the output to the necessary variables, but I am getting a gawk not found error:

    Code:
    ./createlist.sh: line 63: gawk: command not found
    this is the modification I made:

    Code:
     set -- `gnawk -F' *- *' '{for(i=1;i<=NF;++i){printf "field %d: %s\n",i,$i}}' $INPUT_FILE`
        currentJBPM=$1
        time_stamp=$2
        report_type=$3
        skip_node=$4
        start_node=$5
  18. #10
  19. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2011
    Posts
    23
    Rep Power
    0
    Sorry, I put gnawk... but I did try with gawk and it did not work. I believe that my system does not have gawk, so I tried nawk. The nawk command is recognized, but the fields are not getting parsed properly.
  20. #11
  21. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2011
    Posts
    23
    Rep Power
    0
    The other thing is, I might have more than one line in the input file. So I want to read one line at a time, do some work with the fields in the current line, and then move to the next line. So I am not sure if nawk/awk is right for me.
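    Reading one line at a time and splitting it can be done in the shell loop itself. A rough sketch, assuming hyphen-separated fields; the temporary file and its two records are invented for the example:

```shell
#!/usr/bin/env bash
# Throwaway input file with two hypothetical records.
input_file=$(mktemp)
printf '%s\n' \
    'CreateVegaFeed-20110928-before-skip-start' \
    'CreateTradeDataFeed-20110929-after-val1 val2-null' > "$input_file"

processed=0
# Each pass reads one line and splits it on "-" into the five variables.
while IFS=- read -r currentJBPM time_stamp report_type skip_node start_node; do
    echo "process=$currentJBPM skip=[$skip_node] start=$start_node"
    processed=$((processed + 1))
    last_skip=$skip_node
done < "$input_file"

rm -f "$input_file"
echo "lines processed: $processed"
```

    The work on each record goes inside the loop body, where the five variables hold the fields of the current line.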
  22. #12
  23. No Profile Picture
    Contributing User
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2006
    Posts
    2,477
    Rep Power
    1752
    There are a few variations of the awk program: the original one, named, funnily enough, awk; then gawk (GNU awk); and also nawk (I think named for new awk). When I said [gn]awk I was trying to refer to them all as one ... too clever by half, I fear!
    Let's keep to awk (probably safest, as most/all installs of *nix will have that).
    awk, in any of its forms, will process a file one line at a time, 'chopping' up each line into fields. This is what you seem to want. Thus, what we need to do is find out which unique character in your file denotes a 'break' between the fields as you wish to see them.
    From an earlier post, your input file looks to have lines set up as follows:
    Code:
    + INPUT_FILE='CreateVegaFeed - 20110928 - before - skip - start '
    Then you mention that the 4th field (skip) can comprise two words. Things get 'confused' when spaces start appearing, as various programs assume that a space denotes a 'break' in the flow.
    I think that your initial premise of using the hyphen as a field separator is going to be the best idea. What any version of awk can do for you in this sort of situation is 'pluck out' each field, one at a time. As you are going to have to do this one line at a time, we will need to 'cheat' by having each line of the file read into a variable, and then pass that variable to awk to read as standard input:
    Code:
    # IFS= and -r keep leading spaces and backslashes in the line intact
    while IFS= read -r file_line
    do
      # pluck out each hyphen-separated field from the current line
      currentJBPM=$(echo "$file_line" | awk -F- '{print $1}')
      time_stamp=$(echo "$file_line" | awk -F- '{print $2}')
      report_type=$(echo "$file_line" | awk -F- '{print $3}')
      skip_node=$(echo "$file_line" | awk -F- '{print $4}')
      start_node=$(echo "$file_line" | awk -F- '{print $5}')
    done < "$input_file"
    I've not got a *nix box handy at the moment so that is untested.
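    If the fields do carry spaces around the hyphens, they will still be attached after splitting. One possible way to strip them, sketched with a hypothetical trim helper and a made-up sample line (neither comes from the original script):

```shell
#!/usr/bin/env bash
# trim is a hypothetical helper: strips leading and trailing whitespace.
trim() {
    local s=$1
    s="${s#"${s%%[![:space:]]*}"}"   # drop leading whitespace
    s="${s%"${s##*[![:space:]]}"}"   # drop trailing whitespace
    printf '%s' "$s"
}

# Assumed sample: spaces around each hyphen, two words in the 4th field.
line='CreateVegaFeed - 20110928 - before - val1 val2 - start '
IFS=- read -r currentJBPM time_stamp report_type skip_node start_node <<< "$line"

skip_node=$(trim "$skip_node")
start_node=$(trim "$start_node")
echo "skip: [$skip_node]"
echo "start: [$start_node]"
```

    The space inside "val1 val2" is kept; only the padding at either end of each field is removed.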
  24. #13
  25. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2011
    Posts
    23
    Rep Power
    0
    I am really confused now. I keep getting this error:

    Code:
    + '[' GPM_FIXINGS GPM_RAW_CASH = null -a null = null ']'
    ./createlist.sh: line 112: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH '!=' null -a null = null ']'
    ./createlist.sh: line 129: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH = null -a null '!=' null -a null = null ']'
    ./createlist.sh: line 147: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH = null -a null '!=' null -a null '!=' null ']'
    ./createlist.sh: line 152: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH '!=' null -a null '!=' null -a null = null ']'
    ./createlist.sh: line 159: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH '!=' null -a null '!=' null -a null '!=' null ']'
    ./createlist.sh: line 165: [: too many arguments
    The variables get assigned the proper strings, and everything except the if statements is working fine. I am really confused about why they are causing an error.

    here is the code:

    Code:
    + '[' GPM_FIXINGS GPM_RAW_CASH = null -a null = null ']'
    ./createlist.sh: line 112: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH '!=' null -a null = null ']'
    ./createlist.sh: line 129: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH = null -a null '!=' null -a null = null ']'
    ./createlist.sh: line 147: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH = null -a null '!=' null -a null '!=' null ']'
    ./createlist.sh: line 152: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH '!=' null -a null '!=' null -a null = null ']'
    ./createlist.sh: line 159: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH '!=' null -a null '!=' null -a null '!=' null ']'
    ./createlist.sh: line 165: [: too many arguments
    
    ....
    and here is the output:

    Code:
    + INPUT_FILE='CreateTradeDataFeed:20110929:before:GPM_FIXINGS,GPM_RAW_CASH:null:null '
    + read -r i
    + read currentJBPM time_stamp report_type skip_node start_node runnodes
    ++ IFS=:
    ++ echo CreateTradeDataFeed 20110929 before GPM_FIXINGS,GPM_RAW_CASH null null
    + echo process: CreateTradeDataFeed
    process: CreateTradeDataFeed
    + echo timestamp: 20110929
    timestamp: 20110929
    + echo report type: before
    report type: before
    + echo skip: GPM_FIXINGS,GPM_RAW_CASH
    skip: GPM_FIXINGS,GPM_RAW_CASH
    + echo start: null
    start: null
    + echo runnodes: null
    runnodes: null
    ++ echo GPM_FIXINGS,GPM_RAW_CASH
    ++ tr , ' '
    + skip_node='GPM_FIXINGS GPM_RAW_CASH'
    ++ echo null
    ++ tr , ' '
    + runnodes=null
    + '[' CreateTradeDataFeed = null -o 20110929 = null -o before = null ']'
    + [[ 20110929 =~ ^[0-9]+$ ]]
    ++ date +%Y%m%d
    + datestamp=20110930
    + echo 'Please wait patiently while Reports are being generated...'
    Please wait patiently while Reports are being generated...
    + echo ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    + echo 'Current Process: CreateTradeDataFeed'
    Current Process: CreateTradeDataFeed
    + echo ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    + NEW_DIR=scripts/reports/EQDReports.sh
    + FULL_DIR=/app/mxjava/gpm_pat07/tds/scripts/reports/EQDReports.sh
    ++ date +%Y%m%d
    + datestamp=20110930
    + '[' GPM_FIXINGS GPM_RAW_CASH = null -a null = null ']'
    ./createlist.sh: line 112: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH '!=' null -a null = null ']'
    ./createlist.sh: line 129: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH = null -a null '!=' null -a null = null ']'
    ./createlist.sh: line 147: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH = null -a null '!=' null -a null '!=' null ']'
    ./createlist.sh: line 152: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH '!=' null -a null '!=' null -a null = null ']'
    ./createlist.sh: line 159: [: too many arguments
    + '[' GPM_FIXINGS GPM_RAW_CASH '!=' null -a null '!=' null -a null '!=' null ']'
    ./createlist.sh: line 165: [: too many arguments
    I am really confused! Please help. Thanks!
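    The trace shows [ receiving GPM_FIXINGS GPM_RAW_CASH as two separate words, which is what happens when a variable holding a space is expanded unquoted inside a test. A minimal sketch, assuming the tests in createlist.sh are written along the lines of [ $skip_node = null -a $start_node = null ] (the script itself is not shown, so that form is a guess):

```shell
#!/usr/bin/env bash
# Values taken from the trace; the test expressions themselves are assumed.
skip_node='GPM_FIXINGS GPM_RAW_CASH'
start_node='null'

# Unquoted: $skip_node expands to two words, so [ sees an extra argument
# and fails with "too many arguments" (a non-zero exit status).
[ $skip_node = null -a $start_node = null ] 2>/dev/null \
    && unquoted_status=0 || unquoted_status=$?

# Quoted: each variable stays a single argument and the test evaluates normally.
if [ "$skip_node" != null ] && [ "$start_node" = null ]; then
    result='skip list set, no start node'
else
    result='other case'
fi

echo "unquoted test exit status: $unquoted_status"
echo "quoted test result: $result"
```

    Quoting every expansion inside [ ] and chaining separate tests with && instead of -a should avoid the error whenever a value contains a space.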
  26. #14
  27. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,966
    Rep Power
    481

    awk named for its authors


    Alfred Aho, Peter Weinberger, and Brian Kernighan

    bison: yacc (yet another compiler compiler)

    flex: lex (lexical analyzer)

    ash: the Almquist shell, for sh

    less: more

    I use the names of the GNU versions of the Bell Labs originals.

    bash: Bourne-again shell, for sh (the Bourne shell), with features borrowed from ksh and csh.
  28. #15
  29. No Profile Picture
    Contributing User
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2006
    Posts
    2,477
    Rep Power
    1752
    Just show the plain output of a cat of your script