Page 1 of 2 12 Last
  • Jump to page:
    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    8
    Rep Power
    0

    Data Parsing (was: Need help in perl)


    I have to parse the data given below using a Perl script.

    Input file:

    B5F3Y8##MF01624.15;13-125;900-950@MF05188.12;133-258@MF05192.13;273-562@MF05190.13;430-521@MF00488.16;569-801@
    B5F414##MF01583.15;27-182@
    B5F415##MF00009.22;25-246@MF03144.20;261-325@
    B5F430##MF01077.17;172-331;425-559@MF03460.12;73-140;350-418@

    The numbers in the file should be sorted, and the output format is given below:

    B5F3Y8##MF01624.15;MF05188.12;MF05192.13;MF05190.13;MF00488.16;MF01624.15;
    B5F414##MF01583.15;
    B5F415##MF00009.22;MF03144.20;
    B5F430##MF03460.12;MF01077.17;MF03460.12;MF01077.17;

    Please help me parse this data using Perl code.
  2. #2
  3. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    May 2004
    Location
    Reno, NV
    Posts
    4,251
    Rep Power
    1810
    There are a lot of ways to do it. Here's one:

    Code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    
    while (<DATA>) {
    	chomp;
    	my @row = split /@/;
    	# keep only the text before the first ';' in each @-separated entry
    	for (@row) {
    		my $index = index $_, ';';
    		$_ = substr($_, 0, $index);
    	}
    
    	my $result = join ';', @row;
    	print "$result\n";
    }
    
    __DATA__
    B5F3Y8##MF01624.15;13-125;900-950@MF05188.12;133-258@MF05192.13;273-562@MF05190.13;430-521@MF00488.16;569-801@
    B5F414##MF01583.15;27-182@
    B5F415##MF00009.22;25-246@MF03144.20;261-325@
    B5F430##MF01077.17;172-331;425-559@MF03460.12;73-140;350-418@
    Even though you said you wanted some numbers sorted, you didn't provide an example or explain whether you meant the fields or the lines. I'm going to leave that as an exercise for you.
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    828
    Rep Power
    496
    You may try something somewhat "LISPy" like this (only partly tested):

    Perl Code:
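    # $FILE_IN is assumed to be a filehandle already opened for reading
    while (my $line = <$FILE_IN>) {
         chomp $line;
         my ($start, $end) = split /##/, $line;
         # keep the numeric part of every MF identifier, e.g. "01624.15"
         my @fields = map { /MF([\d.]+)/ ? $1 : () } split /;/, $end;
         # sort numerically in descending order, then restore the "MF" prefix
         my $out_line = $start . "##" . join ";", map { "MF$_" } sort { $b <=> $a } @fields;
         # do what you want with $out_line
    }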
    Last edited by Laurent_R; April 14th, 2013 at 01:51 PM.
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    8
    Rep Power
    0

    Need help in perl


    Originally Posted by keath
    Even though you said you wanted some numbers sorted, you didn't provide an example or explain whether you meant the fields or the lines. I'm going to leave that as an exercise for you.

    Thank you so much for your reply.
    What I actually need is this:
    For example, in the first line the numbers (13-125, 900-950, etc.) have to be sorted in ascending order, and the corresponding IDs should be printed.

    Output:
    B5F3Y8##MF01624.15;MF05188.12;MF05192.13;MF05190.13;MF00488.16;MF01624.15;
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    828
    Rep Power
    496
    Well, then, my code is not doing what you want since I understood that you wanted to sort the numbers after the MF string in reverse order (that's what I got from your examples).

    Originally Posted by reji123
    Output:
    B5F3Y8##MF01624.15;MF05188.12;MF05192.13;MF05190.13;MF00488.16;MF01624.15;
    Why do you have "MF01624.15" twice in your output?

    Please also explain how to sort the "13-125;900-950" numbers, since they are not numbers but pairs of numbers. Do you want them sorted according to the first number and then the second one?

    Also, the first field in each line does not have such numbers. How should it be sorted? Or is it the number that comes afterward that you want to use as a sort key?
    Last edited by Laurent_R; April 14th, 2013 at 12:23 PM.
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    828
    Rep Power
    496
    OK, just for fun, here is another "LISP-in-Perl" solution, assuming that:
    - @ is the main record separator
    - each main record consists of an MFxxxxx identifier and a pair of values, separated by ;
    - identifiers have to be sorted by the first value of the pair, and then by the second.
    - when an identifier has two pairs of values, the second pair is ignored.

    Perl Code:
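    while (my $line = <$FILE_IN>) {
         chomp $line;  # without this, the trailing newline becomes a bogus record after the last @
         my ($start, $end) = split /##/, $line;
         # map each identifier to its first [low, high] pair; a second pair, if any, is ignored
         my %fields =  map { my ($id, $val, undef) =
              split /;/, $_; $id, [split /-/, $val];}
              split /@/, $end;
         # sort the identifiers by the first value of the pair, then by the second
         my $out_line = $start . "##" . join ";",
              sort {$fields{$a}->[0] <=> $fields{$b}->[0]
                 or $fields{$a}->[1] <=> $fields{$b}->[1]}
              keys %fields;
         # do what you want with $out_line
    }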


    For the first line of your input, this produces:

    Code:
    B5F3Y8##MF01624.15;MF05188.12;MF05192.13;MF05190.13;MF00488.16
    which is, I believe, what you want.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    7
    Rep Power
    0
    What actually is this pearl programming? Can any one let me know?
  14. #8
  15. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    8
    Rep Power
    0

    Need help in perl


    Originally Posted by Laurent_R
    Well, then, my code is not doing what you want since I understood that you wanted to sort the numbers after the MF string in reverse order (that's what I got from your examples).



    Why do you have "MF01624.15" twice in your output?

    Please also explain how to sort the "13-125;900-950" numbers, since they are not numbers but pairs of numbers. Do you want them sorted according to the first number and then the second one?

    Also, the first field in each line does not have such numbers. How should it be sorted? Or is it the number that comes afterward that you want to use as a sort key?

    Yes, MF01624.15 should appear twice in my output, since MF01624.15 has two positions, 13-125 and 900-950. So I have to sort by the first number and print the corresponding IDs.

    Assume MF01624.15 has three positions: 13-125;259-270;900-950.
    Then my output file should be

    B5F3Y8##MF01624.15;MF05188.12;MF01624.15;MF05192.13;MF05190.13;MF00488.16;MF01624.15;

    Hope you got it.
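
    The rule described in this post can be illustrated with a small sketch. It is not code from the thread: the record is a shortened, made-up version of the first input line with the three positions assumed above, and the variable names are chosen only for illustration. Every position contributes one (start, id) pair, and the pairs are then ordered by their start number, which is why MF01624.15 shows up once per position.

```perl
use strict;
use warnings;

# Made-up record from the example above: MF01624.15 has three positions.
my $entries = 'MF01624.15;13-125;259-270;900-950@MF05188.12;133-258@MF05192.13;273-562';

my @positions;    # one [start, id] pair per position
for my $entry (split /@/, $entries) {
    my ($id, @ranges) = split /;/, $entry;
    for my $range (@ranges) {
        my ($start) = split /-/, $range;    # "13-125" -> 13
        push @positions, [$start, $id];
    }
}

# Order the pairs by start position and show which id each one belongs to.
my @sorted = sort { $a->[0] <=> $b->[0] } @positions;
printf "%4d  %s\n", @$_ for @sorted;
```

    The starts come out as 13, 133, 259, 273, 900, so MF01624.15 is listed first, third and last.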
  16. #9
  17. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    828
    Rep Power
    496
    Originally Posted by SharonClark
    What actually is this pearl programming? Can any one let me know?
    This is just normal regular Perl (not pearl) programming, but with an emphasis on the use of list operators, which are quite handy when you need to apply a series of transformations on a list of items.

    Please let me know if you need additional explanations.
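
    As a rough illustration of what "list operators" means here, the sketch below chains map, grep and sort on a small made-up list of ranges. The data and the cutoff of 100 are assumptions picked only for the example. The chain is read from the bottom up: each operator receives the list produced by the one below it.

```perl
use strict;
use warnings;

my @ranges = ('900-950', '13-125', '273-562');    # made-up sample data

# Read bottom to top: take the first number of each range,
# keep the ones above 100, then sort them in ascending order.
my @starts = sort { $a <=> $b }
             grep { $_ > 100 }
             map  { (split /-/, $_)[0] }
             @ranges;

print "@starts\n";    # prints "273 900"
```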
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    828
    Rep Power
    496
    Originally Posted by reji123
    Yes MF01624.15 should come twice in my output since for MF01624.15 I am having two positions 13-125 and 900-950. So I have to sort the first number and print the corresponding ids.

    assume MF01624.15 has 13-125;259-270;900-950 3 positions
    then my output file should be

    B5F3Y8##MF01624.15;MF05188.12;MF01624.15;MF05192.13;MF05190.13;MF00488.16;MF01624.15;

    Hope you got it.
    Yes, I understand what you want.

    You still have not answered this question:

    Please also explain how to sort the "13-125;900-950" numbers, since they are not numbers but pairs of numbers. Do you want them sorted according to the first number and then the second one?
    I am not going to code something again if you don't give the full specification. You would have had a working program already yesterday if you had explained exactly what you needed right away.
  20. #11
  21. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    8
    Rep Power
    0
    Originally Posted by Laurent_R
    Yes, I understand what you want.

    You still have not answered this question:



    I am not going to code something again if you don't give the full specification. You would have had a working program already yesterday if you had explained exactly what you needed right away.
    Sorry. In my previous message I already mentioned that I have to sort by the first number only, e.g. 13, 900, etc.
  22. #12
  23. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    828
    Rep Power
    496
    Hi,

    yes, you are right, reji123, you said it, I just did not understand it was the answer to the question.

    This is the new version:

    Perl Code:
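    while (my $line = <$FILE_IN>) {
    	chomp $line;
    	my ($start, $end) = split /##/, $line;
    	# build one [id, first number] pair for every position of every identifier
    	my @fields = map {
    			my ($id, @values) = split /;/, $_;
    			@values = map { (split /-/, $_)[0] } @values;   # keep only the first number of each pair
    			map { [$id, $_] } @values;
    		} split /@/, $end;
    	# sort the pairs by their number, then keep only the identifiers
    	my $out_line = $start . "##" . join ";",
    		map { $_->[0] }
    		sort { $a->[1] <=> $b->[1] } @fields;
    	# do what you want with $out_line
    }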


    This produces the following output for your first line:

    Code:
    B5F3Y8##MF01624.15;MF05188.12;MF05192.13;MF05190.13;MF00488.16;MF01624.15
  24. #13
  25. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    8
    Rep Power
    0
    Originally Posted by Laurent_R
    Hi,

    yes, you are right, reji123, you said it, I just did not understand it was the answer to the question.

    Thank you so much. It's working.
    Can you please explain each function and what it does? Or, if you could write the program using only basic functions, that would be helpful for me.
  26. #14
  27. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    828
    Rep Power
    496
    Originally Posted by reji123
    Thank you so much. Its working.
    Can you please give explanation on each function and what it does or if you can make up the program by using basic functions then it will be helpful for me.
    OK, I'll break it up into smaller pieces, so that it is more easily understood.
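
    One possible way to break it up is sketched below. This is not the version posted in the thread, only an illustration using plain loops and the basic functions split, shift, push and sort. The sub name sort_ids_by_start and the variable names are hypothetical, and the two sample lines are copied from the original input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turns one input line into one output line.
sub sort_ids_by_start {
    my ($line) = @_;
    chomp $line;

    # Split "B5F414##MF01583.15;27-182@" into the name and the entries.
    my ($name, $entries) = split /##/, $line, 2;

    # Step 1: collect one [start, id] pair for every position of every id.
    my @pairs;
    foreach my $entry (split /@/, $entries) {
        my @parts = split /;/, $entry;        # ("MF01077.17", "172-331", "425-559")
        my $id    = shift @parts;             # the first part is the id
        foreach my $range (@parts) {
            my @bounds = split /-/, $range;   # ("172", "331")
            push @pairs, [ $bounds[0], $id ];
        }
    }

    # Step 2: sort the pairs numerically by their start position.
    my @sorted = sort { $a->[0] <=> $b->[0] } @pairs;

    # Step 3: rebuild the line, with ";" after every id as in the requested format.
    my $out = $name . '##';
    foreach my $pair (@sorted) {
        $out .= $pair->[1] . ';';
    }
    return $out;
}

my @lines = (
    'B5F414##MF01583.15;27-182@',
    'B5F430##MF01077.17;172-331;425-559@MF03460.12;73-140;350-418@',
);
print sort_ids_by_start($_), "\n" for @lines;
```

    For the second sample line the starts are 73, 172, 350 and 425, so the ids alternate between MF03460.12 and MF01077.17, as in the requested output.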
  28. #15
  29. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    8
    Rep Power
    0
    Originally Posted by Laurent_R
    OK, I'll break it up into smaller pieces, so that it is more easily understood.

    Yah. Thank You