#1
    Unexpected XPath return value


    Hi,
    I'm trying to extract nodes from a simple XML file using XML::LibXML, but I'm receiving a list of "1"s instead of XML::LibXML::Node objects:
    Code:
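    use strict;
    use warnings;
    use XML::LibXML;

    my $p = XML::LibXML->new();
    my $dom = $p->parse_file($ARGV[0]);    # file name comes from the command line
    my $data_elem;
    my @d = $dom->findnodes('//*[name()="record"]');    # every element named "record"
    print join("|",@d)."\n";
    exit 0;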
    
    [zb@p01 ~]$ cat abc.xml
    <?xml version="1.0"?>
    <data>
    <record num="1" name="John"/>
    <record num="2" name="Ian"/>
    <record num="3" name="Dana"/>
    <record num="4" name="John"/>
    </data>
    [zb@p01 ~]$
    [zb@p01 ~]$ ./test.pl abc.xml
    1|1|1|1
    [zb@p01 ~]$
    I would expect the array to be filled with node references, e.g. XML::LibXML::Node=SCALAR(0x1008b20).

    Code:
    [zb@p01 ~]$ rpm -qa|grep -i -e "^perl"|grep "XML"
    perl-XML-Parser-2.36-7.el6.x86_64
    perl-XML-SAX-Writer-0.50-8.el6.noarch
    perl-XML-LibXML-1.70-5.el6.x86_64
    perl-XML-Checker-0.13-1.el6.rf.noarch
    perl-XML-Simple-2.18-6.el6.noarch
    perl-XML-XPath-1.13-10.el6.noarch
    perl-XML-Filter-BufferText-1.01-8.el6.noarch
    perl-XML-RegExp-0.03-7.el6.noarch
    perl-XML-Grove-0.46alpha-40.el6.noarch
    perl-XML-Dumper-0.81-6.el6.noarch
    perl-XML-DOM-1.44-7.el6.noarch
    perl-XML-Twig-3.34-1.el6.noarch
    perl-XML-NamespaceSupport-1.10-3.el6.noarch
    perl-XML-SAX-0.96-7.el6.noarch
    [zb@p01 ~]$ cat /etc/centos-release
    CentOS release 6.2 (Final)
    [zb@p01 ~]$
    
    libxml2 version: libxml2-2.7.6-4.el6_2.4.x86_64
    Thank you.
#2
    I made a tiny change to load directly from a hard-coded filename:
    Code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    
    use Data::Dumper;
    use XML::LibXML;
    
    my $p = XML::LibXML->new();
    my $doc = $p->parse_file('tst.xml');
    
    my @d = $doc->findnodes('//*[name()="record"]');
    print join("|",@d)."\n";
    result:
    <record num="1" name="John"/>|<record num="2" name="Ian"/>|<record num="3" name="Dana"/>|<record num="4" name="John"/>
    You might want to check your encoding. Since the XML declaration in your document doesn't specify one, the file is expected to be UTF-8.

    ---

    Interesting behavior though. I added a few lines:

    Code:
    print Dumper \@d;
    
    my $results = $doc->findnodes('//*[name()="record"]');
    print Dumper $results;
    
    foreach my $n (@d) {
    	print "$n\n";
    }
    result:
    Code:
    $VAR1 = [
              bless( do{\(my $o = '140406530683760')}, 'XML::LibXML::Element' ),
              bless( do{\(my $o = '140406530233184')}, 'XML::LibXML::Element' ),
              bless( do{\(my $o = '140406531567024')}, 'XML::LibXML::Element' ),
              bless( do{\(my $o = '140406528811984')}, 'XML::LibXML::Element' )
            ];
    $VAR1 = bless( [
                     bless( do{\(my $o = '140406530683760')}, 'XML::LibXML::Element' ),
                     bless( do{\(my $o = '140406530233184')}, 'XML::LibXML::Element' ),
                     bless( do{\(my $o = '140406531567024')}, 'XML::LibXML::Element' ),
                     bless( do{\(my $o = '140406528811984')}, 'XML::LibXML::Element' )
                   ], 'XML::LibXML::NodeList' );
    <record num="1" name="John"/>
    <record num="2" name="Ian"/>
    <record num="3" name="Dana"/>
    <record num="4" name="John"/>
    @d does contain objects, but when they are printed or joined they are automatically converted to strings (here, their XML serialization).

    Many object-oriented languages have a text or description method that, when defined, is used to produce a string description of an object. I don't know how to do that in Perl though.
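
    Perl's counterpart to such a description method is the core overload pragma, which lets a class supply its own string conversion through the '""' key. Below is a minimal sketch of the mechanism, not XML::LibXML's actual code: the three Demo::* classes and their constructors are hypothetical names made up for illustration. The second class shows one way an object can end up printing as 1: if only 'bool' is overloaded, Perl derives the string conversion from it. Whether XML::LibXML 1.70 behaves that way is an assumption the thread does not confirm.

    ```perl
    use strict;
    use warnings;

    # Hypothetical class with an explicit string conversion.
    package Demo::Stringy;
    use overload
        '""'     => sub { my ($self) = @_; return "<record name=\"$self->{name}\"/>" },
        fallback => 1;
    sub new { my ($class, %args) = @_; return bless { %args }, $class }

    # Hypothetical class that overloads only the boolean conversion.
    package Demo::BoolOnly;
    use overload 'bool' => sub { 1 }, fallback => 1;
    sub new { my ($class) = @_; return bless {}, $class }

    # Hypothetical class with no overloading at all.
    package Demo::Plain;
    sub new { my ($class) = @_; return bless {}, $class }

    package main;

    my $stringy   = Demo::Stringy->new(name => 'John');
    my $bool_only = Demo::BoolOnly->new;
    my $plain     = Demo::Plain->new;

    print "$stringy\n";      # calls the '""' handler
    print "$bool_only\n";    # no '""' handler, so the 'bool' handler is used instead
    print "$plain\n";        # default form, Demo::Plain=HASH(0x...)
    ```

    The same conversion happens whenever the object is used as a string, so join("|", $stringy, $bool_only) goes through the same handlers as print.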

    The LibXML documentation has a few notes along the lines of:
    Other expressions might return an XML::LibXML::Boolean object, or an XML::LibXML::Literal object (a string). Each of those objects uses Perl's overload feature to "do the right thing" in different contexts.
    The only mechanism I can think of is 'wantarray', which might give the node the hint it needs when it is called from print or join. Does anyone know what's happening here?
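
    For comparison, here is a small sketch of what wantarray can and cannot do. It only tells a subroutine whether its own caller wants a list or a scalar, which is presumably how findnodes hands back a plain list for @d and an XML::LibXML::NodeList object for $results in the Dumper output above. It cannot tell an object that has already been returned how print or join will later use it; that conversion is handled by overloading. The find_items subroutine and the Demo::NodeList class are hypothetical stand-ins, not part of XML::LibXML.

    ```perl
    use strict;
    use warnings;

    # Hypothetical stand-in for a list-wrapping class such as a node list.
    package Demo::NodeList;
    sub new  { my ($class, @items) = @_; return bless [@items], $class }
    sub size { my ($self) = @_; return scalar @$self }

    package main;

    # Returns a plain list in list context and a wrapper object in scalar context.
    sub find_items {
        my @found = ('a', 'b', 'c');
        return wantarray ? @found : Demo::NodeList->new(@found);
    }

    my @list   = find_items();    # list context: wantarray is true
    my $object = find_items();    # scalar context: wantarray is false

    print scalar(@list), "\n";    # number of items in the plain list
    print ref($object), "\n";     # class of the wrapper object
    print $object->size, "\n";    # number of items held by the wrapper
    ```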
