#1
    Help on unix script to join similar lines of input


    Hi,

    I have been thinking about how to script this, but I have no clue at all.
    Could someone please help me out or give me some ideas?

    I would like to group the lines that share the same first field, joining their second fields with commas.
    Let's say I have the following input.
    Code:
    aa c1
    aa c2
    aa c3
    cc d1
    dd e1
    dd e2
    ee f1
    I would like the output to be like this.
    Code:
    aa c1,c2,c3
    cc d1
    dd e1,e2
    ee f1
    Could this be done easily with a bash script?
    Or should I try a Perl script instead?
    I'm a beginner in both bash and Perl.

    Thank you.
#2
    Are the lines sorted? If so, it should be reasonably simple to do in almost any scripting language, from perl to awk or 'plain' bash script.

    'All' you need to do is track the value in the first column as you read the input. If it is the same as the last value read, append the value in the second column to a variable. If it is different, print the previous group (if there is one), then reset the variable to the new second-column value and remember the new first-column value.

    Code:
    awk 'BEGIN { x=0; c1=""; c2="" }
       {
         if ($1 != c) { if (c1 != "") { print c1, c2 } c1=$1;c2=$2;x=1 }
         else { x += 1 ; if (x == 0) { c2=$2 } else { c2=c2","$2 } }
       }
       END { if (x !=0) { print c1, c2 } }' your_input_file.txt
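
    For the 'plain' bash route, the tracking logic described above might look something like the sketch below. It assumes the input is already sorted (or at least grouped) by the first column and has two whitespace-separated fields per line. The function name join_sorted is made up for illustration.

```shell
# Hypothetical helper: reads "key value" lines on stdin, grouped by key.
join_sorted() {
    local prev="" vals="" key val
    # The "|| [ -n "$key" ]" part keeps a final line that lacks a trailing newline.
    while read -r key val || [ -n "$key" ]; do
        # Skip blank lines.
        if [ -z "$key" ]; then
            continue
        fi
        if [ "$key" != "$prev" ]; then
            # New key: print the previous group, if any, then start a new one.
            if [ -n "$prev" ]; then
                printf '%s %s\n' "$prev" "$vals"
            fi
            prev=$key
            vals=$val
        else
            # Same key as the previous line: append with a comma.
            vals="$vals,$val"
        fi
    done
    # Print the last group.
    if [ -n "$prev" ]; then
        printf '%s %s\n' "$prev" "$vals"
    fi
}
```

    It would be called as join_sorted < your_input_file.txt, with a sort in front if the file is not already in order.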
    The moon on the one hand, the dawn on the other:
    The moon is my sister, the dawn is my brother.
    The moon on my left and the dawn on my right.
    My brother, good morning: my sister, good night.
    -- Hilaire Belloc
#3
    Originally Posted by SimonJM
    Sorry, it doesn't work for me...

    Anyway, I've found a short solution to this.
    ${input} is the filename of the input file.

    Code:
    for m in `cat ${input} | awk '{print $1}' | sort | uniq `
    do
            var=`grep "^${m} " ${input} | awk '{print $2}' | tr '\n' ',' | sed '$s/,$//'`
            echo "${m} ${var}"
    done
    Thanks anyway.
#4
    Gah! Simple typo, sorry!

    Code:
    awk 'BEGIN { x=0;c1="";c2="" }
    {
     if ($1 != c1) { if (c1 != "" ) { print c1,c2 } c1=$1;c2=$2;x=1 }
     else { x += 1; if (x == 0) { c2=$2 } else { c2=c2","$2 } }
    }
    END { if (x != 0) { print c1,c2 } }' Your_input_file.txt
    Glad you got it sorted another way. You could replace the | sort | uniq with a simple sort -u if you wished.
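
    As a rough sketch of that change, the loop could be wrapped up as below. The function name join_with_loop and passing the file as an argument are assumptions for illustration, and $( ) is used in place of backticks. Note that the key is still handed to grep as a regular expression, so keys containing characters like . or * may match more than intended.

```shell
# Hypothetical wrapper: takes the input file as its only argument.
join_with_loop() {
    input=$1
    # sort -u replaces the original "sort | uniq" pair.
    for m in $(awk '{print $1}' "$input" | sort -u); do
        # Collect the second field of every line starting with this key,
        # join the values with commas, and drop the trailing comma.
        var=$(grep "^${m} " "$input" | awk '{print $2}' | tr '\n' ',' | sed 's/,$//')
        echo "${m} ${var}"
    done
}
```

    This still reads the file once per distinct key, so the single-pass awk version is likely the better choice for large inputs.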
#5
    Originally Posted by SimonJM
    It works now!
    I need to study your command...
    Thank you so much!
#6
    Originally Posted by rei125
    Better use this:
    Code:
    awk '{if(k!=$1)c=""; a[$1]=a[$1] c $2; c=",";k=$1}
    END {for (i in a) print i,a[i]}
    ' Your_input_file.txt | sort
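
    One caveat with the array version above: it resets the separator whenever the key changes, so if the same key shows up again later in an unsorted file, its values get glued together without a comma. A sketch that does not depend on the input order, and prints keys in first-seen order instead of piping through sort, might look like this (the function name join_by_key and the array names are made up for illustration):

```shell
# Hypothetical helper: reads "key value" lines from stdin or from named files.
join_by_key() {
    awk '
        # Ignore lines that do not have both a key and a value.
        NF < 2 { next }
        # First time a key is seen: remember its position and start its list.
        !($1 in vals) { order[++n] = $1; vals[$1] = $2; next }
        # Key seen before: append with a comma, wherever the line sits in the file.
        { vals[$1] = vals[$1] "," $2 }
        # Print the groups in the order their keys first appeared.
        END { for (i = 1; i <= n; i++) print order[i], vals[order[i]] }
    ' "$@"
}
```

    Usage would be join_by_key Your_input_file.txt. The trade-off is that every group is held in memory until the end of the input.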
