#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2012
    Posts
    3
    Rep Power
    0

    CSV file Columns check and delete


    Hi All,

    i have a CSV file like below,

    col1 col2 col3 col4 col5 col6 col7
    a1 a2 a3 a4 a5 a6 a7
    b1 b2 b3 b4 b5 b6 b7
    c1 c2 c3 c4 c5 c6 c7
    d1 d2 d3 d4 d5 d6 d7

    Col1,col2.. are the column names and a1,a2,b1,b2.. are the data of the file.

    i need to write a shell script, for here i need a check like which ever the unnecessary columns from the CSVfile have to be deleted, and mandatory columns sholud be specified in the scripts itself.

    can any body help me on this for writing the shell script.

    thanks in advance.
  2. #2
  3. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2012
    Location
    spaceBAR Central
    Posts
    225
    Rep Power
    41
    If your file is TAB delimited, This is an example of deleting the 2nd column
    Code:
    $ cat t2
    col1    col2    col3    col4    col5    col6    col7
    a1      a2      a3      a4      a5      a6      a7
    b1      b2      b3      b4      b5      b6      b7
    c1      c2      c3      c4      c5      c6      c7
    d1      d2      d3      d4      d5      d6      d7
    
    $ sed 's/^\([0-9a-zA-Z]*\t\)[0-9a-zA-Z]*\t/\1/' t2
    col1    col3    col4    col5    col6    col7
    a1      a3      a4      a5      a6      a7
    b1      b3      b4      b5      b6      b7
    c1      c3      c4      c5      c6      c7
    d1      d3      d4      d5      d6      d7


    If your file is COMMA delimited, This is an example of deleting the 2nd column
    Code:
    $ cat t2
    col1,col2,col3,col4,col5,col6,col7
    a1,a2,a3,a4,a5,a6,a7
    b1,b2,b3,b4,b5,b6,b7
    c1,c2,c3,c4,c5,c6,c7
    d1,d2,d3,d4,d5,d6,d7
    
    $ sed 's/^\([0-9a-zA-Z]*,\)[0-9a-zA-Z]*,/\1/' t2
    col1,col3,col4,col5,col6,col7
    a1,a3,a4,a5,a6,a7
    b1,b3,b4,b5,b6,b7
    c1,c3,c4,c5,c6,c7
    d1,d3,d4,d5,d6,d7
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2012
    Posts
    3
    Rep Power
    0
    Actually i need it like a shell script. which are the necessary columns i have to specify in the script itself, based on that remaining columns should need to delete.
  6. #4
  7. Contributed User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jun 2005
    Posts
    4,364
    Rep Power
    1870
    Perhaps the cut tool is more appropriate.
    Code:
    $ cat file1.txt
    col1,col2,col3,col4,col5,col6,col7
    a1,a2,a3,a4,a5,a6,a7
    b1,b2,b3,b4,b5,b6,b7
    c1,c2,c3,c4,c5,c6,c7
    d1,d2,d3,d4,d5,d6,d7
    $ cut -d',' -f2,4,6 file1.txt
    col2,col4,col6
    a2,a4,a6
    b2,b4,b6
    c2,c4,c6
    d2,d4,d6
    $ cut --complement -d',' -f2,4,6 file1.txt
    col1,col3,col5,col7
    a1,a3,a5,a7
    b1,b3,b5,b7
    c1,c3,c5,c7
    d1,d3,d5,d7
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2012
    Posts
    3
    Rep Power
    0
    No i have taken an array like

    ArrayName=("a1" "a3" "a4" "a5" "a6" "a7" "a8")

    these are the fiels names in CSV file, and along with these field name there are so many number of Field Names in CSV file. now i want to delete the Fields which are not present in the ArrayName.

    can any one help me on this for writing a shell scripts.

    Thanks in advance.
  10. #6
  11. Contributed User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jun 2005
    Posts
    4,364
    Rep Power
    1870
    > No i have taken an array like
    > ArrayName=("a1" "a3" "a4" "a5" "a6" "a7" "a8")
    Wait a minute - a1 etc were fields in the CSV, and col1 etc were column names in post #1.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2006
    Posts
    2,396
    Rep Power
    1688
    If you could get that array into a plain text/string then:

    Code:
    echo \("a1" "a3" "a4" "a5" "a6" "a7" "a8"\) | tr -d "[:alpha:]|[:punct:]" | tr -s " " ","
    Would give you the column numbers, comma-delimited suitable for use in a cut -f command as mentioned by salem, above.
    The moon on the one hand, the dawn on the other:
    The moon is my sister, the dawn is my brother.
    The moon on my left and the dawn on my right.
    My brother, good morning: my sister, good night.
    -- Hilaire Belloc

IMN logo majestic logo threadwatch logo seochat tools logo