UNIX Help
 
  #1  
August 17th, 2006, 04:15 PM
samit_9999
Break a file based on a column, copying header information

I have a file with data laid out as follows:

abc.txt
Employee Sal
10 100
10 20
20 400
20 20
20 100
10 40
20 10

I want to create files dynamically based on the employee number. In the above case the output should go to 2 different files, one for each employee: 10.txt and 20.txt.

The header information (the first row of the original file) should be copied into every sub-file that is created.

10.txt
Employee Sal
10 100
10 20
10 40

20.txt
Employee Sal
20 400
20 20
20 100
20 10

Note -
1) I do not know the employee numbers in advance; there could be 2 employees or 20.
2) I also do not want to hard-code the header information. It should be whatever is present in the original file.

Please help me work this out, especially as I am very new to Unix.

  #2  
August 17th, 2006, 05:22 PM
SimonJM
Two issues here - the header and ... everything else!

Easy part: the header - assuming it is always at the start of the file:
Code:
head -1 abc.txt > header.txt

Now we get the tweaky bits!
Code:
# tail -n +2 starts at line 2, skipping the header (older systems wrote this as tail +2)
for emp in `tail -n +2 abc.txt | awk '{print $1}' | sort -u`
do
  cp header.txt "$emp.txt"
  grep "^$emp " abc.txt >> "$emp.txt"
done

And that should do it for you

As for explanation:
The head -1 that gets the header is a command to list the head of a file; the -1 is a parameter saying how many lines, in this case 1 (head -n 1 is the equivalent POSIX spelling). We send that single line to another file, called header.txt, with the 'redirect' operator, >.
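
A quick way to see that step in isolation is sketched below. It assumes a scratch directory made with mktemp (the workdir variable is just for illustration, not part of the original answer) and a shortened copy of the sample data, and it uses the head -n 1 spelling.

```shell
# Illustration only: work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate a shortened version of the sample file from the question.
printf '%s\n' 'Employee Sal' '10 100' '10 20' '20 400' > abc.txt

# Take only the first line and redirect it into header.txt.
# (> creates the file, or truncates it if it already exists.)
head -n 1 abc.txt > header.txt

# Show what was captured.
cat header.txt
```

header.txt should now contain just the header row, whatever that row happens to be, so nothing about it is hard-coded.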

The loop doing most of the work:
for emp in <--- this defines the loop and says that on each iteration we will use a variable called emp, which will hold the next value in the list.
The rest of that line tells us (and the shell) just what is going into that variable. In this case it is the result of the command tail +2 (list the file starting from line 2, i.e. skipping the header line; current systems spell this tail -n +2), piped (the | ) into an awk command which, in this case, prints the first field (the employee number) and passes that on (via another pipe) to a sort command which returns just the unique values (the -u). Thus, altogether, the variable emp should be set, in turn, to each employee number.
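
To make the pipeline easier to follow, here is a rough sketch that runs each stage separately and then all together. The stage1.txt to stage3.txt files, the emps variable, and the scratch directory are my own names for illustration; the data is the sample from the question.

```shell
# Illustration only: work in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The sample file from the question.
printf '%s\n' 'Employee Sal' '10 100' '10 20' '20 400' \
              '20 20' '20 100' '10 40' '20 10' > abc.txt

# Stage 1: drop the header by starting output at line 2.
tail -n +2 abc.txt > stage1.txt
# Stage 2: keep only the first whitespace-separated field.
awk '{print $1}' stage1.txt > stage2.txt
# Stage 3: sort and collapse duplicates.
sort -u stage2.txt > stage3.txt

# The same three stages as one pipeline, captured the way the for loop sees them.
emps=$(tail -n +2 abc.txt | awk '{print $1}' | sort -u)
echo "$emps"
```

With the sample data, stage2.txt holds seven employee numbers (with repeats) and the final list is just 10 and 20, one per line.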
So, in that loop we copy (cp) the header.txt file to a file called $emp.txt (the $emp is a reference to the variable that contains the employee number). That sets us up with a copy of the header in each file.
Next we have a grep command that checks the file for lines that start with the employee number (the "^$emp " - the space is important so that an employee number of 1 does not find both 1 and 10!). We append that output to our $emp.txt file (with the >> redirection), and then we go back and get $emp populated with the next number, until there are no more to be found.
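
The point about the trailing space can be checked with a small sketch. The demo.txt file and employee number 1 are hypothetical (the original sample only has 10 and 20), and grep -c is used here simply to count matching lines.

```shell
# Illustration only: work in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical data: employee 1 does not appear in the original sample.
printf '%s\n' 'Employee Sal' '1 50' '10 100' '10 20' '1 75' > demo.txt

emp=1
# Without the trailing space, "^1" also matches the lines starting with 10.
loose=$(grep -c "^$emp" demo.txt)
# With the trailing space, only lines whose first field is exactly 1 match.
strict=$(grep -c "^$emp " demo.txt)

echo "loose=$loose strict=$strict"
```

Note that this relies on the fields being separated by a single space, as in the sample. If the real file were tab-separated, the pattern would need adjusting.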

  #3  
August 18th, 2006, 08:49 AM
samit_9999
Thanks Simon, that worked perfectly!

  #4  
August 19th, 2006, 05:50 AM
SimonJM
A pleasure

© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 2 hosted by Hostway