#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2012
    Posts
    31
    Rep Power
    3

    Parsing CSV File


    Hi,

    I have a csv file like below :

    Apple,apple juice,"apple juice is good for health. It is made from apple juice, and it can be mixed with other fruits. Everyone should have apple juice.",10,


    I am using the Text::CSV module to parse this file.

    I am having an issue parsing

    ,"apple juice is good for health. It is made from apple juice, and it can be mixed with other fruits. Everyone should have apple juice."


    I am using the below

    Text::CSV->new({auto_diag => 1, quote_char => '"', escape_char => '"', allow_loose_quotes => 1})

    but it still does not parse this part of the CSV properly.

    Any suggestions ?

    Thanks
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2012
    Posts
    31
    Rep Power
    3
    Hi,

    Any suggestions? Has anyone parsed a CSV line whose fields contain commas, periods and double quotes?

    Thanks
  4. #3
  5. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    May 2004
    Location
    Reno, NV
    Posts
    4,264
    Rep Power
    1810
    Shouldn't be a problem. Can you give us a full line at least rather than just one field?
  6. #4
  7. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    May 2004
    Location
    Reno, NV
    Posts
    4,264
    Rep Power
    1810
    I don't know if you changed the example line. When I looked at it originally it began with a comma and had only one field.

    Anyway, the following works for me:

    Code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    
    use Data::Dumper;
    use Text::CSV;
    
    my $csv = Text::CSV->new ({ binary => 1, eol => $/ });
    my $file = 'test.csv';
    
    open my $fh, "<", $file or die "$file: $!";
    while (my $row = $csv->getline ($fh)) {
    	print Dumper $row;
    }
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2012
    Posts
    31
    Rep Power
    3
    Thanks-

    My exact line in the file is as follows :

    Fruit Basket,Fruit collection,"The fruits are many, from many countries. The fruits are sent by people from different countries. Supported Fruits: A-apple, B-Banana, B-Blueberry. Note:The flavor and taste of this fruit can differ based on countries."


    Using the code you posted, I get output like this:

    Fruit Basket,Fruit collection,"The fruits are many, from many countries. The fruits are sent by people from different countries

    Supported Fruits:

    A-apple

    B-Banana

    B-Blueberry


    Note:The flavor and taste of this fruit can differ based on countries



    As you can see, there seem to be newline characters between the double quotes, and they break the output.

    Any suggestions, please?

    I also tried replacing the newline characters and commas between the double quotes, but that did not work.

    Thanks-
  10. #6
  11. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    May 2004
    Location
    Reno, NV
    Posts
    4,264
    Rep Power
    1810
    Well, even that isn't your exact line, since it still hasn't preserved any newlines unfortunately.

    But I get the gist.

    Code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    
    use Data::Dumper;
    use Text::CSV;
    
    my $csv = Text::CSV->new ({ binary => 1, eol => $/ });
    my $file = 'test.csv';
    
    open my $fh, "<", $file or die "$file: $!";
    while (my $row = $csv->getline ($fh)) {
    	s/\n//g for @$row;
    	print Dumper $row;
    }
    You have to set binary mode to true, as I've done here. That is what lets the parser accept newlines embedded inside quoted fields. You'll need to remove the ones contained in the fields after the field parsing has completed.

    I've used a regex on every field in this example. You could limit it to a single field to speed things up possibly, but it depends on your file.
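
    To illustrate the single-field variant mentioned above, here is a minimal sketch. It assumes the multi-line text always sits in the third field, and it reads from an in-memory string (hypothetical sample data) instead of test.csv so that it runs on its own. It also replaces each line break with a space rather than deleting it, which is a choice of this sketch and not something the code above does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Text::CSV;

# Hypothetical sample data: one record whose third field spans several lines.
my $data = qq{Fruit Basket,Fruit collection,"Many fruits.\nSupported Fruits:\nA-apple"\n};

# binary => 1 lets the parser accept newlines inside quoted fields.
my $csv = Text::CSV->new ({ binary => 1, auto_diag => 1 });

# Open a read handle on the in-memory string instead of a file on disk.
open my $fh, "<", \$data or die "in-memory file: $!";

my @rows;
while (my $row = $csv->getline ($fh)) {
    # Replace line breaks only in the third field (index 2).
    $row->[2] =~ s/\r?\n/ /g;
    push @rows, $row;
}
close $fh;

print join (" | ", @$_), "\n" for @rows;
```

    If the embedded newlines can turn up in any column, looping over every field as in the code above is the safer choice.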
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2012
    Posts
    31
    Rep Power
    3
    Thanks-

    Your code looks like it should work, but when I try your exact code I don't get any output back. It is not reading the line with the Dumper command.

    so I modified my code as below :

    use warnings;

    use IO::File;
    use Data::Dumper;
    use Text::CSV;

    my $csv = Text::CSV->new ({ binary => 1, eol => $/ });
    my $file = 'test.csv';

    my $fh = new IO::File($file, "r") or die "Cant open file";

    while (my $row = <$fh>) {

    if ($csv->parse($row)) {
    map { s/\n//g } @$row;
    print $row;
    }
    }

    This is only printing like below :

    Supported Fruits:

    A-apple

    B-Banana

    B-Blueberry


    I think this file, even when read in binary mode, has many escape characters and newline characters.

    Is there any way we can clean this file?

    Appreciate your suggestions.
  14. #8
  15. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    May 2004
    Location
    Reno, NV
    Posts
    4,264
    Rep Power
    1810
    Code:
    while (my $row = <$fh>) {
    You can't do that. The reason is that Perl's line-read operator splits on every newline and has no way of knowing whether a newline is embedded within quotes, so it will not return complete records. You have to use the following:

    Code:
    while (my $row = $csv->getline ($fh)) {
    As to why it doesn't work, I'd rather not guess. How about attaching an actual file that doesn't work for you?
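
    The difference between the two loops can be seen with a small sketch. The sample data here is hypothetical (two records, the first with a quoted field containing two embedded newlines) and is read from an in-memory string so the script is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Text::CSV;

# Hypothetical sample data: two CSV records; the first has a quoted
# field with two embedded newlines.
my $data = qq{a,"line one\nline two\nline three",1\nb,plain,2\n};

# Plain readline splits on every newline, quoted or not.
open my $raw, "<", \$data or die "in-memory file: $!";
my @physical_lines = <$raw>;
close $raw;

# getline keeps reading until the quoted field has been closed.
my $csv = Text::CSV->new ({ binary => 1, auto_diag => 1 });
open my $fh, "<", \$data or die "in-memory file: $!";
my @records;
while (my $row = $csv->getline ($fh)) {
    push @records, $row;
}
close $fh;

printf "readline returned %d lines, getline returned %d records\n",
    scalar @physical_lines, scalar @records;
```

    Handing the readline fragments to parse one at a time means the parser sees an opening quote with no closing quote, or a continuation line on its own, so it cannot reassemble the original record.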
