#1

    Rearrange In Perl Please


    Hi everybody,
    I'm learning Perl. From the tinysql.xml text below, I only want <TSeq_accver>HQ288224.1</TSeq_accver>, <TSeq_orgname>Eutypella vitis</TSeq_orgname> and <TSeq_sequence>TCTCCGTTGGTGAACCAGCGGAGGGATCATTAAAGAGTAGTTTTTACAACAACTCCAAACCCATGTGAACTTACCTATGTTGCCT CGGCGGGGAAACTACCCTGTAGCTACCCTGTAGCTACCCTGTAAGGACTACTCGTCGACGGACCATTAAACTCTGTTTTTCTATGAAACTTCTGAGTGTT TTAACTTAATAAATTAAAACTTTCAACAACGGATCTCTTGGTTCTGGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCA GTGAATCATCGAATCTTTGAACGCACATTGCGCCCATTAGTATTCTAGTGGGCATGCCTGTTCGAGCGTCATTTCGACCATCAAGCCCTATTTGCTTGGC GTTGGGAGCTTACCCTGCAGTTGCGGGATAACTCCTCAAATATATTGGCGGAGTCGCGGAGACCCTAAGCGTAGTAATTCTTCTCGCTTTAGTAGTGTTA ACGCTGGCATCTGGCCACTAAACCCCTAATTTTTATAGGTTTGACCTCGGATCAGGTAGGAATACCCGCTGAACTTAA</TSeq_sequence>

    ######

    <TSeqSet>
    <TSeq>
    <TSeq_seqtype value="nucleotide"/>
    <TSeq_gi>336440873</TSeq_gi>
    <TSeq_accver>HQ288224.1</TSeq_accver>
    <TSeq_taxid>140563</TSeq_taxid>
    <TSeq_orgname>Eutypella vitis</TSeq_orgname>
    <TSeq_gi>336440873</TSeq_gi>
    <TSeq_defline>Eutypella vitis strain UCD2291AR 18S ribosomal RNA gene, partial sequence; internal transcribed spacer 1, 5.8S ribosomal RNA gene, and internal transcribed spacer 2, complete sequence; and 28S ribosomal RNA gene, partial sequence</TSeq_defline>
    <TSeq_length>563</TSeq_length>
    <TSeq_sequence>TCTCCGTTGGTGAACCAGCGGAGGGATCATTAAAGAGTAGTTTTTACAACAACTCCAAACCCATGTGAACTTA CCTATGTTGCCTCGGCGGGGAAACTACCCTGTAGCTACCCTGTAGCTACCCTGTAAGGACTACTCGTCG</TSeq_sequence>
    </TSeq>
    </TSeqSet>

    I wrote a script:

    while ( my $lines = <TINY> ) {
        # foreach aliases $_ to the current line, so the bare m// tests below see it
        foreach ($lines) {
            if (m/<TSeq_accver>.*<\/TSeq_accver>/) {
                # accession: strip the tags and spaces, then start the header line
                $lines =~ s/<TSeq_accver>//g and $lines =~ s/<\/TSeq_accver>//g;
                $lines =~ s/ //g;
                chomp($lines);
                print New_File ">$lines\_";
            } elsif (m/<TSeq_orgname>.*<\/TSeq_orgname>/) {
                # organism name: strip the tags and pairs of whitespace, then end the header line
                $lines =~ s/<TSeq_orgname>//g and $lines =~ s/<\/TSeq_orgname>//g;
                $lines =~ s/\s{2}//g;
                chomp($lines);
                print New_File "$lines\n";
            } elsif (m/<TSeq_sequence>.*<\/TSeq_sequence>/) {
                # sequence: strip the tags and spaces, then print it on its own line
                $lines =~ s/<TSeq_sequence>//g and $lines =~ s/<\/TSeq_sequence>//g;
                $lines =~ s/ //g;
                chomp($lines);
                print New_File "$lines\n";
            }
        }
    }
    close TINY;
    close New_File;

    It deletes all the text except what I want, but it puts the <TSeq_accver>.*<\/TSeq_accver> value before the <TSeq_orgname>.*</TSeq_orgname> value. I want the opposite: the <TSeq_orgname>.*</TSeq_orgname> value should come first.

    Please help me!
#2
    Please use the [ code ] tags when posting your code. It's the # button.

    You should not be parsing XML files with simple regexes. Instead, you should be using one of the many XML parser modules on CPAN.

    XML::Simple - Easily read/write XML (esp config files)

    XML::Twig - A perl module for processing huge XML documents in tree mode.

    XML::LibXML - Perl Binding for libxml2

    General search for XML modules
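
As a rough illustration of the parser route, here is a minimal sketch using XML::LibXML (it has to be installed from CPAN; it is not a core module). The helper name `tseq_to_fasta` is made up for this example, the sample XML is trimmed from the post above with a shortened sequence, and the `>orgname_accver` header layout is only my reading of what the question asks for. Because each field is looked up by name, the order of the elements in the file no longer decides the order of the output.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Sample input trimmed from the post above; the sequence is shortened
# here purely for illustration.
my $xml = q{<TSeqSet><TSeq>
  <TSeq_seqtype value="nucleotide"/>
  <TSeq_accver>HQ288224.1</TSeq_accver>
  <TSeq_orgname>Eutypella vitis</TSeq_orgname>
  <TSeq_sequence>TCTCCGTTGGTGAACC AGCGGAGGGATC</TSeq_sequence>
</TSeq></TSeqSet>};

# Hypothetical helper (not part of any module): returns one
# FASTA-style record per <TSeq> element in the given XML text.
sub tseq_to_fasta {
    my ($xml_text) = @_;
    my $doc = XML::LibXML->load_xml( string => $xml_text );
    my @records;
    for my $seq ( $doc->findnodes('//TSeq') ) {
        # Look each field up by element name, independent of file order.
        my $org = $seq->findvalue('./TSeq_orgname');
        my $acc = $seq->findvalue('./TSeq_accver');
        my $dna = $seq->findvalue('./TSeq_sequence');
        $dna =~ s/\s+//g;    # drop any whitespace inside the sequence
        # Assumed layout: organism name first, then the accession.
        push @records, ">${org}_${acc}\n$dna\n";
    }
    return @records;
}

print tseq_to_fasta($xml);
```

For the sample above this prints `>Eutypella vitis_HQ288224.1` followed by the sequence on the next line. To read the real file instead of a string, `XML::LibXML->load_xml( location => 'tinysql.xml' )` should work in place of the `string =>` form.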
