Help on unix script to join similar lines of input
June 27th, 2012, 08:24 PM
Hi,
I have been thinking about how to script this, but I have no idea where to start.
Could someone please help me out or point me in the right direction?
I would like to group the lines that share the same first field, joining their second fields with commas.
Let's say I have the following input.
Code:
aa c1
aa c2
aa c3
cc d1
dd e1
dd e2
ee f1
I would like the output to be like this.
Code:
aa c1,c2,c3
cc d1
dd e1,e2
ee f1
Could this be done easily with a bash script?
Or should I try a Perl script instead?
I'm a beginner in both bash and Perl.
Thank you.

June 27th, 2012, 11:01 PM
Are the lines sorted? If so, this should be reasonably simple in almost any scripting language, from perl to awk or 'plain' bash.
'All' you need to do is track the value in the first column as you read the input. If it is the same as the last value read, append the value in the second column to a variable. If it is different (and you have output to show), print the output, reset the variable, and record the new first-column value.
Code:
awk 'BEGIN { x=0; c1=""; c2="" }
{
if ($1 != c) { if (c1 != "") { print c1, c2 } c1=$1;c2=$2;x=1 }
else { x += 1 ; if (x == 0) { c2=$2 } else { c2=c2","$2 } }
}
END { if (x !=0) { print c1, c2 } }' your_input_file.txt

June 29th, 2012, 02:49 AM
Quote: Originally Posted by SimonJM
Are the lines sorted? If so it should be reasonably simple to do in almost any scripting language from perl to awk or 'plain' bash script. [...]
Sorry, it doesn't work for me...
Anyway, I've found a short solution to this.
${input} is the name of the input file.
Code:
for m in `cat ${input} | awk '{print $1}' | sort | uniq `
do
var=`grep "^${m} " ${input} | awk '{print $2}' | tr '\n' ',' | sed '$s/,$//'`
echo "${m} ${var}"
done
Thanks anyway.

June 29th, 2012, 08:11 AM
Gah! Simple typo, sorry!
Code:
awk 'BEGIN { x=0;c1="";c2="" }
{
if ($1 != c1) { if (c1 != "" ) { print c1,c2 } c1=$1;c2=$2;x=1 }
else { x += 1; if (x == 0) { c2=$2 } else { c2=c2","$2 } }
}
END { if (x != 0) { print c1,c2 } }' Your_input_file.txt
Glad you got it sorted another way. You could replace the | sort | uniq with a simple sort -u if you wished.
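Since the original question asked about plain bash, the same track-the-previous-key approach can also be sketched as a shell loop. This is only a sketch under a few assumptions: the function name join_sorted is made up for illustration, equal keys must sit on adjacent lines (sorted input), and unlike the awk version it keeps everything after the first field as the value.

```shell
# join_sorted: hypothetical helper name, not from the thread.
# Reads "key value" lines from stdin (equal keys must be adjacent)
# and prints one line per key with its values joined by commas.
join_sorted() {
    local key value prev="" joined="" have=0
    # "|| [ -n "$key" ]" also handles a last line with no trailing newline
    while read -r key value || [ -n "$key" ]; do
        [ -z "$key" ] && continue            # skip blank lines
        if [ "$have" -eq 1 ] && [ "$key" = "$prev" ]; then
            joined="$joined,$value"          # same key as before: append
        else
            # new key: print the finished group, then start a new one
            [ "$have" -eq 1 ] && printf '%s %s\n' "$prev" "$joined"
            prev=$key
            joined=$value
            have=1
        fi
    done
    # flush the final group, if any input was read
    [ "$have" -eq 1 ] && printf '%s %s\n' "$prev" "$joined"
    return 0
}
```

It would be called as join_sorted < Your_input_file.txt, or as sort Your_input_file.txt | join_sorted if the input is not already sorted.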

July 1st, 2012, 09:09 PM
Quote: Originally Posted by SimonJM
Gah! Simple typo, sorry! [...]
It works now!
I need to study your command...
Thank you so much!

July 9th, 2012, 11:05 AM
Quote: Originally Posted by rei125
It works now! I need to study your command... thank you so much!
Better use this:
Code:
awk '{if(k!=$1)c=""; a[$1]=a[$1] c $2; c=",";k=$1}
END {for (i in a) print i,a[i]}
' Your_input_file.txt | sort
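One caveat with the one-liner above: the separator is chosen from the previous line's key, so it appears to assume that equal keys are adjacent. If a key reappears after a different one, its values get concatenated without a comma (for example, "aa c1", "bb d1", "aa c2" gives "aa c1c2"). A possible variant that does not depend on input order is sketched below. The function name join_any_order is hypothetical, and skipping lines with fewer than two fields is an assumption, not something the thread specifies.

```shell
# join_any_order: hypothetical helper name, not from the thread.
# Groups "key value" lines by key regardless of input order.
# Reads the files given as arguments, or stdin if there are none.
join_any_order() {
    awk '
    NF >= 2 {
        if ($1 in joined)
            joined[$1] = joined[$1] "," $2   # key seen before: append with a comma
        else
            joined[$1] = $2                  # first value for this key
    }
    # "for (key in joined)" visits keys in no particular order, hence the sort
    END { for (key in joined) print key, joined[key] }
    ' "$@" | sort
}
```

It would be called as join_any_order Your_input_file.txt. Values for each key stay in the order they were read; only the output lines are sorted.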
