#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Posts
    9
    Rep Power
    0

    Numeric sort of tab-delimited file


    Greetings, I need help reading this input file:
    w r t y
    6 re 5 6
    5 ee 4 2
    2 54 3 2
    3 ew 1 1

    I need to sort numerically on column 0 only, starting at row 1 (the row beginning with 6), and output the entire lines into another file.
    So my $desfile should look like this:
    w r t y
    2 54 3 2
    3 ew 1 1
    5 ee 4 2
    6 re 5 6

    thanks a lot
  2. #2
  3. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    May 2004
    Location
    Reno, NV
    Posts
    4,221
    Rep Power
    1809
    Code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    
    # read the header line first and print it unchanged, so it stays at the top of the output
    my $header = <DATA>;
    print $header;
    
    my @lines;
    while (<DATA>) {
    	push @lines, [split /\s+/];	# split each data line into its fields
    }
    
    my @sorted = sort {$a->[0] <=> $b->[0]} @lines;
    
    foreach my $line (@sorted) {
    	print join("\t", @$line), "\n";
    }
    
    __DATA__
    w	r	t	y
    6	re	5	6
    5	ee	4	2
    2	54	3	2
    3	ew	1	1
    My input is in the __DATA__ section of the script, and I'm directing the output to STDOUT. Those are the only changes you would need to make.
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Posts
    9
    Rep Power
    0

    thanks


    Thank you, keath.

    Will try it in a while.
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    779
    Rep Power
    495
    Just a variation on Keath's code, a classic and elegant construct known as the "Schwartzian Transform" (just Google it if you want to know more), doing exactly the same thing but with a very different syntax:

    Perl Code:
    #!/usr/bin/perl
    use strict;
    use warnings;
     
    print map {"@$_\n"}
    sort {$a->[0] <=> $b->[0]}
    map {[split /\s+/]} <DATA>;
     
    __DATA__
    6	re	5	6
    5	ee	4	2
    2	54	3	2
    3	ew	1	1


    Here, the whole work is done in a single Perl statement (split across three lines above to improve readability).


    To understand it, you need to read it from right to left and from bottom to top: <DATA> produces a list of lines that is passed to the map block at the bottom, which turns each line into an array of fields whose first element is the column of interest for your sort. Then, on the line above, the sort is performed with the first element of each array as the comparison key. Finally, the map on the top line rebuilds a text line from each of the arrays.
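
    The same statement once more, with each step annotated in the order it runs (the comments simply restate the explanation above):

    Perl Code:
    print                                  # 5. print the rebuilt lines
        map  { "@$_\n" }                   # 4. turn each array back into a text line
        sort { $a->[0] <=> $b->[0] }       # 3. numeric sort on the first element of each array
        map  { [split /\s+/] }             # 2. split each line into an array of fields
        <DATA>;                            # 1. read all the lines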
    Last edited by Laurent_R; October 10th, 2013 at 05:09 PM.
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Posts
    9
    Rep Power
    0

    ok, so


    Very nice!

    How would I write that out to a file with, say, the same name in a different directory?

    thanks

    Originally Posted by Laurent_R
    Just a variation on Keath's code, a classic and elegant construct known as the "Schwartzian Transform" (just Google it if you want to know more), doing exactly the same thing but with a very different syntax: [...]
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    779
    Rep Power
    495
    Something along these lines (untested):

    Perl Code:
    my $file_in = "input_dir/file.txt";
    my $file_out = "output_dir/file.txt";
     
    open my $IN, "<",  $file_in or die "cannot open $file_in $!";
    open my $OUT, ">",  $file_out or die "cannot open $file_out $!";
     
    # store first line separately
    my $header = <$IN>;
    print $OUT $header;
     
    print $OUT map {"@$_\n"}
    sort {$a->[0] <=> $b->[0]}
    map {[split /\s+/]} <$IN>;
     
    close $_ for ($IN, $OUT);
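
    One small detail: "@$_\n" interpolates the array elements with the default list separator $", which is a single space, so the output file will be space-separated rather than tab-separated. If you want to keep the tabs, rebuild each line with join instead, for example:

    Perl Code:
    print $OUT map  { join("\t", @$_) . "\n" }
               sort { $a->[0] <=> $b->[0] }
               map  { [split /\s+/] } <$IN>;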
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Posts
    9
    Rep Power
    0

    new question


    Greetings,
    Thanks for the last responses; they helped a lot.
    New question:
    I have huge files and need to look at columns 2 & 3 (after the header row) of a .csv file.
    If they are both repeated anywhere in the file, I need to save only the last occurrence.
    thanks,
    Rodney



    Originally Posted by Laurent_R
    Something along these lines (untested): [...]
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    779
    Rep Power
    495
    What do you mean exactly by huge files? How many lines? That will make the difference between a very simple solution and a more complicated one.
  16. #9
  17. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Posts
    9
    Rep Power
    0

    hello


    About 25 wafers by 48 sites...
    not that big, I guess: roughly 1,200 lines. Thanks.

    Originally Posted by Laurent_R
    What do you mean exactly by huge files? How many lines? That will make the difference between a very simple solution and a more complicated one.
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    779
    Rep Power
    495
    Then use a hash. Read your file and, as you go along, check in the hash whether you have already seen the value, and add it to the hash if you haven't.
  20. #11
  21. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Posts
    9
    Rep Power
    0

    thanks


    I will try that out soon, thanks so much

    Originally Posted by Laurent_R
    Then use a hash. Read your file and, as you go along, check in the hash whether you have already seen the value, and add it to the hash if you haven't.
  22. #12
  23. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Posts
    9
    Rep Power
    0

    follow up


    How would you check whether the line was already in your hash?
    It could have been 100 lines ago... thanks


    Originally Posted by Laurent_R
    Then use a hash. Read your file and, as you go along, check in the hash whether you have already seen the value, and add it to the hash if you haven't.
  24. #13
  25. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Posts
    9
    Rep Power
    0

    also


    Hi,
    What if the occurrence happened 100 lines earlier?
    I need to replace the earlier occurrence with the new one.
    I guess I would have to read the file at least twice and loop?
    Sorry, and thanks.

    Originally Posted by az_perlberd
    How would you check whether the line was already in your hash?
    It could have been 100 lines ago... thanks
  26. #14
  27. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    779
    Rep Power
    495
    Just an example of a program fragment for removing duplicates from a file.

    Perl Code:
    my %already_seen;
    while (<$IN>) {
        next if exists $already_seen{$_};   # skip lines already printed (keeps the first occurrence)
        print $_;
        $already_seen{$_} = 1;              # remember this line so later duplicates are skipped
    }
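
    The fragment above keeps the first occurrence of each duplicated line. For your other question (keeping only the last occurrence when columns 2 and 3 repeat), you do not need to read the file twice: key a hash on those two columns, let later lines overwrite earlier ones, and print at the end. An untested sketch, assuming a comma-separated file, columns numbered from 0 as in your first question, and $IN and $OUT opened as before:

    Perl Code:
    my (%last, @order);
     
    my $header = <$IN>;                # copy the header line straight through
    print $OUT $header;
     
    while (my $line = <$IN>) {
        my @fields = split /,/, $line;                  # .csv: split on commas
        my $key    = join ',', @fields[2, 3];           # columns 2 and 3 identify a duplicate
        push @order, $key unless exists $last{$key};    # remember where each pair first appeared
        $last{$key} = $line;                            # later lines overwrite earlier ones
    }
     
    print $OUT $last{$_} for @order;   # one line per pair, with the content of its last occurrence

    This prints the lines in the order in which each pair was first seen, but with the content of its last occurrence; adjust to taste.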
  28. #15
  29. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Posts
    9
    Rep Power
    0

    thanks


    Thank you, Laurent_R.

    Originally Posted by Laurent_R
    Just an example of a program fragment for removing duplicates from a file. [...]
