#1

    Comparing two files and displaying matched results


    I've got the following problem to solve...I have two files containing the following information:

    a.txt

    Code:
     alan, 23, alan@yahoo.com
    albert, 27, albert@yahoo.com
    b.txt

    Code:
     alan:173:analyst
    victor:149:director
    albert:171:clerk
    coste:27:driver
    I need to extract the name (field zero) from every line of both files, compare them and, if they match, print the age and occupation information. Thus, my output should be:

    Code:
     alan, 23, analyst
    albert, 27, clerk
    What I have got so far, and it's not working:

    Code:
     open F2, 'a.txt' or die $!;
    @interesting_lines = <F2>;
    
    foreach $line (@interesting_lines ) {
    @string = split(', ', $line);
    print "$string[0]\n";
    }
    close F2;
    
    open F1, 'b.txt' or die $!;
    while (defined(my $line = <F1>)) {
    @string2 = split(':', $line);
    print $string2[0];
    
    print "$.:\t$string2[0]" if grep {$string2[0] eq $_} $string[0] ; }
    Does anyone have any ideas how I can implement my requirements? Thanks... PS: both files might have more lines than I posted, but file b.txt will always have every name that file a.txt has, plus extra lines.
#2
    Hi, haven't got time to try coding and running things at the moment, but some suggestions:

    Try first loading one file into a hash structure, creating records with name as key and the other fields as hash members.

    Then loop through the other file checking if name exists in the hash (exists()) and if so create your output line from the combination of fields in the hash and the file being read.
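
    The two steps above can be sketched as follows. This is only a minimal sketch, not tested against the real files: join_files is a hypothetical helper name, the sample strings stand in for a.txt and b.txt as posted in the question, and in-memory filehandles are used so it runs without the files on disk.

```perl
use strict;
use warnings;

# Sample data standing in for a.txt and b.txt (contents assumed from the question).
my $a_text = "alan, 23, alan\@yahoo.com\nalbert, 27, albert\@yahoo.com\n";
my $b_text = "alan:173:analyst\nvictor:149:director\nalbert:171:clerk\ncoste:27:driver\n";

# join_files is a hypothetical helper: it takes two open filehandles and
# returns the matched "name, age, occupation" lines.
sub join_files {
    my ($a_fh, $b_fh) = @_;

    # Pass 1: build name => age from the comma-separated file.
    my %age_for;
    while (my $line = <$a_fh>) {
        chomp $line;
        my ($name, $age) = split /\s*,\s*/, $line;
        next unless defined $age;    # skip blank or malformed lines
        $age_for{$name} = $age;
    }

    # Pass 2: walk the colon-separated file and keep names seen in pass 1.
    my @out;
    while (my $line = <$b_fh>) {
        chomp $line;
        my ($name, $id, $occupation) = split /:/, $line;
        next unless defined $occupation && exists $age_for{$name};
        push @out, "$name, $age_for{$name}, $occupation";
    }
    return @out;
}

# In-memory filehandles keep the sketch runnable without the real files.
open my $a_fh, '<', \$a_text or die "cannot open in-memory a.txt: $!";
open my $b_fh, '<', \$b_text or die "cannot open in-memory b.txt: $!";
my @result = join_files($a_fh, $b_fh);
print "$_\n" for @result;
```

    With real files you would open 'a.txt' and 'b.txt' instead of the string references. Only the smaller file is held in memory; the second one is read one line at a time.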

    Also note that split takes a regex as the delimiter, not a string, so it won't be working as you expect at the moment:

    split /PATTERN/,EXPR,LIMIT

#3


    Hi,
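    The code below should work for you.

    Code:
    use strict;
    use warnings;
    
    # Read every line of a.txt into an array.
    open (my $afile, '<', "a.txt") or die "unable to open a.txt: $!";
    my @acon=<$afile>;
    close ($afile);
    
    # Slurp b.txt into one string so it can be searched with a regex.
    open (my $bfile, '<', "b.txt") or die "unable to open b.txt: $!";
    my $bcon=join("",<$bfile>);
    close ($bfile);
    
    foreach my $match (@acon)
       {
    	chomp $match;
    	# Split on commas and drop the spaces around them.
    	my ($name,$age,$email)=split(/\s*,\s*/,$match);
    	next unless defined $age;    # skip blank or malformed lines
    	# Anchor at the start of a line and require the ':' so that
    	# "alan" does not match inside "balan" or "alana".
    	if($bcon=~m/^(\Q$name\E:[^\n]*)$/mi)
    	{
    		my ($nametwo,$id,$occupation)=split(/:/,$1);
    		print "$nametwo, $age, $occupation\n";
    	}
       }
    Good luck,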
    Subbaiya Arasu
#4
    Originally Posted by dethorpe
    Also note that split takes a regex as the delimiter, not a string, so it won't be working as you expect at the moment:

    split /PATTERN/,EXPR,LIMIT
    I agree this is not very good practice but it will work nonetheless, as shown by this session under the Perl debugger:

    Code:
      DB<1> $c = "alan:173:analyst"
    
      DB<2> @d = split ":", $c
    
      DB<3> x \@d
    0  ARRAY(0x80359d40)
       0  'alan'
       1  173
       2  'analyst'
      DB<4>
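
    The same point as a small script rather than a debugger session. It is a sketch with sample lines assumed from the question: a string passed as the first argument to split is used as a pattern, with the single-space string ' ' being the one special case.

```perl
use strict;
use warnings;

my $b_line = "alan:173:analyst";
my $a_line = "alan, 23, alan\@yahoo.com";

# A string delimiter is used as a pattern, so these two calls agree.
my @by_string = split ':', $b_line;
my @by_regex  = split /:/, $b_line;

# ', ' is likewise treated as the pattern /, /.
my @a_fields = split ', ', $a_line;

# The one special case: a single-space string splits on runs of
# whitespace and skips leading whitespace, unlike the regex / /.
my @awk_mode  = split ' ', "  alan   23";
my @one_space = split / /, "  alan   23";

print "string delimiter: @by_string\n";
print "regex delimiter:  @by_regex\n";
print "fields from a:    @a_fields\n";
print "split ' ' gives ", scalar(@awk_mode), " fields, split / / gives ",
      scalar(@one_space), " fields\n";
```

    So the split calls in the original question are not what breaks the script, although writing the delimiter as a regex makes the intent clearer.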
#5
    Originally Posted by Arasu84
    Hi,
    The below code will be fine for you.

    This will probably work, but in my view the solution proposed by dethorpe is a far better idea: store the first file in a hash, then read the second file line by line. It will also be faster and use less memory, which might matter if the files are large.
#6
    Originally Posted by garryBrown
    I need to extract the name (field zero) from every line of both files, compare them and, if they match, print the age and occupation information.
    I agree with dethorpe: open the first file, split each line and put the fields into a hash, then open the second file the same way and check whether each name in it exists as a key in the hash.
    If it does, print the name, the age and the occupation.

    One catch: since you are opening more than one file, use a function for that instead of repeating the open-and-read code for every file.

    Moreover, instead of using split, why not use the module Text::CSV or Text::CSV_XS? Then switching sep_char between the two files is a simple change.
    Comments on your code:
    Put use warnings; and use strict; in your code. It helps a million.

    Secondly, don't use a bareword as a filehandle, like F1 and F2; it's better to use a lexically scoped filehandle. Use the three-argument form of open, like open my $fh, '<', $filename or die "can't open file: $!"; which indicates what you are doing: reading, writing, appending or whatever.
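
    A minimal sketch of the three-argument open with a lexical filehandle, in case it helps. read_names is a hypothetical helper name, and a reference to a string is passed in place of a file name only so the example runs without b.txt on disk.

```perl
use strict;
use warnings;

# read_names is a hypothetical helper: it opens its argument for reading
# and returns the first field of every non-empty line.
sub read_names {
    my ($source, $sep) = @_;

    # Three-argument open: the mode '<' is explicit, and $fh is a lexical
    # variable instead of a global bareword such as F1.
    open my $fh, '<', $source or die "can't open file: $!";

    my @names;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';
        my ($name) = split /\Q$sep\E/, $line;
        push @names, $name;
    }
    close $fh;
    return @names;
}

# Sample data assumed from the question, read through an in-memory filehandle.
my $b_text = "alan:173:analyst\nvictor:149:director\n";
my @names  = read_names(\$b_text, ':');
print "@names\n";
```

    Replacing \$b_text with 'b.txt' reads the real file; the mode argument is what changes for writing ('>') or appending ('>>').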

    Passing the two files from the CLI to a Perl script, something like this will do:
    Code:
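    use warnings;
    use strict;
    use Text::CSV;
    
    my %hash;    # name => age, filled from the first file
    
    # First file (a.txt): remember the age for each name.
    reader(
        {
            file     => shift,
            sep      => ',',
            code_ref => sub {
                my $n = shift;
                $hash{ $n->[0] } = $n->[1];
            },
        }
    );
    
    # Second file (b.txt): print a line for every name seen in the first file.
    reader(
        {
            file     => shift,
            sep      => ':',
            code_ref => sub {
                my $n = shift;
                print join( ', ' => $n->[0], $hash{ $n->[0] }, $n->[2] ), "\n"
                  if exists $hash{ $n->[0] };
            },
        }
    );
    
    # Parse one file with the given separator and call code_ref on each row.
    sub reader {
        my $hash_ref = shift;
        # allow_whitespace strips the blanks around each separator.
        my $csv = Text::CSV->new(
            { binary => 1, sep_char => $hash_ref->{sep}, allow_whitespace => 1 } );
        open my $fh, '<', $hash_ref->{file}
          or die "can't open $hash_ref->{file}: $!";
        while ( my $data = $csv->getline($fh) ) {
            ( $hash_ref->{code_ref} )->($data);
        }
        $csv->eof or $csv->error_diag();
        close $fh;
    }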
    What is going on above? I passed the file name, the separator character (, for the first file and : for the second) and a code reference to the reader subroutine, bundled in a hash reference.
    From file 1 we need columns 0 and 1; from file 2 we need columns 0 and 2, hence $n->[0] and $n->[2].
    $n->[0] is the name of the person in both files.
    Hope this helps.
