#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    4
    Rep Power
    0

    Joining 2 csv files on columns and displaying filtered data


    I have 2 csv files A.csv and B.CSV. I want to filter all records in A.CSV that have a value in balance
    for banace_duedate < 10/23/2007.I want to match the files on account_number with B.CSV and print records that match.

    A.CSV
    id account_number balance balance_due
    1 ar161429 482.29 11/28/2007 15:54
    2 ar182364 266.93 10/9/2007 15:54
    3 ar106644 887.78 10/23/2007 15:54

    B.csv
    id name created_date account_number overdue
    20 Reina Kaka 4/6/2007 15:36 ar106644 1
    21 Justine Timber 1/2/2007 15:36 ar465032
    7 Woody Jody 3/7/2007 15:36 ar161429 0
    10 Mayson Tim 12/1/2006 15:36 ar182364 1
  2. #2
  3. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    May 2004
    Location
    Reno, NV
    Posts
    4,254
    Rep Power
    1810
    Originally Posted by myhome
    I have 2 csv files A.csv and B.CSV. I want to filter all records in A.CSV that have a value in balance
    for banace_duedate < 10/23/2007.I want to match the files on account_number with B.CSV and print records that match.
    The approach I would take is to read in B.CSV into a hash first. Use the account number as the key.

    Then read A.csv line by line. If you are using a high enough version of perl, you can use Time::Piece to convert the date-times into object for easy comparison. There are other modules, such as DateTime you can use instead.

    For any lines with date matches, the account lookup will be easy from the hash and you can print any data you need.
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    4
    Rep Power
    0

    I read them into an array.


    I am using arrays

    #!/usr/bin/perl -w


    my @account_id;
    my @account_row;
    my @acct_cmp;
    my @final_accounts;
    d

    account_list();
    balance_list();
    join_file();

    foreach $final(@final_accounts)
    {
    print "$final \n";
    }


    sub join_file
    {

    foreach $act(@account_id)
    {

    foreach $row(@account_row)

    {


    @acct_cmp = split(",",$row);


    if ( $act = $acct_cmp[3] )
    {
    print "$act = $acct_cmp[3] \n";
    push(@final_accounts,$row);

    }
    last if $row=$flag;

    }
    }

    }


    #foreach $act(@account_id)
    # {
    # print "$act \n";
    # }


    sub account_list
    {

    open (ACCTLIST,"account1.csv") || die "can't open ACCTLIST: $!";
    while ( defined ($acctrow = <ACCTLIST>)) {
    chomp ($acctrow);
    push(@account_row,$acctrow);
    }
    close (ACCTLIST) || die "couldn't close ACCTLIST: $!";


    }


    sub balance_list
    {
    open (BALLIST,"balance1.csv") || die "can't open BALLIST: $!";
    while ( defined ($BALrow = <BALLIST>)) {
    chomp ($BALrow);

    my @balance = split(",",$BALrow);
    push(@account_id,$balance[1]);
    }
    close (BALLIST) || die "couldn't close BALLIST: $!";
    }

    #Any account with a balance greater than zero that was due before Oct 23, 2007 has an overdue value of 1
    #Get Account Numbers from the balance file that have balance > 0 and balance_due < Oct 23, 2007 in to an array

    #Any account without an overdue balance has a NULL overdue value (empty in the csv files) or an overdue value of 0
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Apr 2009
    Posts
    1,930
    Rep Power
    1225
    While it is possible to use arrays to do what you want, it would be the wrong data structure. You should use a hash as Keath suggested. I'd extend that to use a HoH (Hash of Hashes).

    Here's a short example that builds the %account hash.
    Code:
    #!/usr/bin/perl
    
    use 5.10.1;
    use strict;
    use warnings;
    use Data::Dumper;
    
    my %account;
    open my $a_fh, '<', 'a.csv' or die "failed to open a.csv $!";
    
    while (my $line = <$a_fh>) {
        chomp $line;
        my ($id, $account_number, $balance, $due) = split /\s+/, $line, 4;
        $account{$id} = {
            account  => $account_number,
            balance  => $balance,
            due_date => $due
        }
    }
    close $a_fh;
    
    print Dumper \%account;
    outputs
    Code:
    $VAR1 = {
              '1' => {
                       'due_date' => '11/28/2007 15:54',
                       'balance' => '482.29',
                       'account' => 'ar161429'
                     },
              '3' => {
                       'due_date' => '10/23/2007 15:54',
                       'balance' => '887.78',
                       'account' => 'ar106644'
                     },
              '2' => {
                       'due_date' => '10/9/2007 15:54',
                       'balance' => '266.93',
                       'account' => 'ar182364'
                     }
            };
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    4
    Rep Power
    0

    Incorrect reading of file


    I am getting following output for some reason it is not reading the file .

    #!/usr/bin/perl

    use strict;
    use warnings;
    use Data:umper;

    my %account;


    open my $a_fh, '<', 'a.csv' or die "failed to open a.csv $!";

    while (my $line = <$a_fh>) {
    chomp $line;
    my ($id, $account_number, $balance, $due) = split /\s+/, $line, 4;

    $account{$id} = {
    account => $account_number,
    balance => $balance,
    due_date => $due
    }
    }
    close $a_fh;


    print Dumper(\%account);

    a.csv
    id account_number balance balance_due
    1 ar161429 482.29 11/28/2007 15:54
    2 ar182364 266.93 10/9/2007 15:54
    3 ar106644 887.78 10/23/2007 15:54

    $VAR1 = {
    '3,ar106644,887.78,10/23/2007' => {
    'due_date' => undef,
    'balance' => undef,
    'account' => '15:54'
    },
    '1,ar161429,482.29,11/28/2007' => {
    'due_date' => undef,
    'balance' => undef,
    'account' => '15:54'
    },
    'id,account_number,balance,balance_due' => {
    'due_date' => undef,
    'balance' => undef,
    'account' => undef
    },
    '2,ar182364,266.93,10/9/2007' => {
    'due_date' => undef,
    'balance' => undef,
    'account' => '15:54'
    }
    };

    C:\Users\Dady\Documents\Opower>
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Apr 2009
    Posts
    1,930
    Rep Power
    1225
    The sample data you originally posted showed that the fields were separated by spaces but the output of the script shows that the fields are comma separated.

    change
    Code:
    my ($id, $account_number, $balance, $due) = split /\s+/, $line, 4;
    to
    Code:
    my ($id, $account_number, $balance, $due) = split /,/, $line, 4;
    The first row of the file will need to be handled differently and I'll leave that to the reader to work out.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    4
    Rep Power
    0

    To filter data by date


    Sorry my mistake . To filter data for date < 10/23/2007 what module. Should I use, can you post an example.
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Apr 2009
    Posts
    1,930
    Rep Power
    1225
    Did you look at the 2 modules that keath suggested? They have examples in the documentation he linked to.

IMN logo majestic logo threadwatch logo seochat tools logo