#1
Break a file based on a column, copying header information
I have a file with data in the following format:

abc.txt
    Employee Sal
    10 100
    10 20
    20 400
    20 20
    20 100
    10 40
    20 10

I want to create files dynamically based on the employee number. In the above case the output should go to 2 different files, one for each employee: 10.txt and 20.txt. The header information (the first row of the original file) should be copied to every sub-file that is created.

10.txt
    Employee Sal
    10 100
    10 20
    10 40

20.txt
    Employee Sal
    20 400
    20 20
    20 100
    20 10

Note:
1) I do not know the employee numbers in advance; there can be 2 employees or 20 employees.
2) I also do not want to hard-code the header information. It should be whatever is present in the original file.

Please help me get around this, especially because I am very new to Unix.
#2
Two issues here - the header and ... everything else!
Easy part: the header, assuming it is always the first line of the file:
Code:
head -n 1 abc.txt > header.txt
Now we get to the tweaky bits!
Code:
for emp in `tail -n +2 abc.txt | awk '{print $1}' | sort -u`
do
    cp header.txt "$emp.txt"
    grep "^$emp " abc.txt >> "$emp.txt"
done
And that should do it for you.

As for the explanation:

head lists the start of a file, and -n 1 says how many lines to list, in this case 1 (the older spelling head -1 means the same thing). We send that single line to another file, called header.txt, with the 'redirect' operator >.

The loop does most of the work. "for emp in" defines the loop and says that on each iteration we will use a variable called emp, which will hold the next value. The rest of that line tells us (and the shell) what goes into that variable. In this case it is the result of the command tail -n +2 (list the file starting from line 2, i.e. skipping the header line; older systems spell this tail +2), piped (the |) into an awk command which, in this case, prints the first field (the employee number) and passes that on (via another pipe) to a sort command which returns just the unique values (the -u). Altogether, the variable emp takes on each employee number in turn.

Inside the loop we copy (cp) the header.txt file to a file called $emp.txt (the $emp is a reference to the variable that contains the employee number). That sets us up with a copy of the header in each file.

Next we have a grep command that checks the file for lines that start with the employee number (the "^$emp " - the space is important so that an employee number of 1 does not match both 1 and 10!). We append that output to our $emp.txt file (with the >> redirection), then go back and get emp populated with the next number, until there are no more to be found.
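For comparison, the same split can probably also be done in a single pass with awk alone, so abc.txt is read once instead of once per employee. This is only a sketch, not part of the loop above: it assumes whitespace-separated fields, the employee number in the first column, no blank lines, and few enough distinct employees that awk can keep one output file open per employee. The names header, out and seen are just illustrative variables, and the here-document simply recreates the sample abc.txt from the question.

```shell
# Recreate the sample input from the question (abc.txt).
cat > abc.txt <<'EOF'
Employee Sal
10 100
10 20
20 400
20 20
20 100
10 40
20 10
EOF

# Single pass: remember the header, then send each data row to
# <employee>.txt, writing the header first the first time an
# employee number is seen.
awk '
NR == 1 { header = $0; next }   # line 1 is the header; keep it, do not output it yet
{
    out = $1 ".txt"             # output file is named after column 1
    if (!(out in seen)) {       # first row for this employee
        print header > out      # ">" truncates the file when awk first opens it
        seen[out] = 1
    }
    print > out                 # later prints go to the same open file
}
' abc.txt
```

Because awk compares the whole first field when building the file name, an employee number of 1 cannot be confused with 10 here, which is the job the trailing space does in the grep pattern.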
#3
Thanks Simon, that worked perfectly!
#4
A pleasure