#1
    Request to change perl code according to expected output


    Hi all,

    I have an input file like this (it's a small part of a huge file):

    TTDS00002 UniProt ID P11229
    TTDS00002 Name Muscarinic acetylcholine receptor M1
    TTDS00002 Type of target Successful target
    TTDS00002 Synonyms M1 receptor
    TTDS00002 Disease Alzheimer's disease
    TTDS00002 Disease Bronchospasm (histamine induced)
    TTDS00002 Disease Cognitive deficits
    TTDS00002 Disease Schizophrenia
    TTDS00002 Function The muscarinic acetylcholine receptor mediates various cellular responses, including inhibition of adenylate cyclase, breakdown of phosphoinositides and modulation of potassium channels through the action of G proteins.
    TTDS00002 Sequence MNTSAPPAVSPNITVLAPGKGPWQVAFIGITTGLLSLATVTGNLLVLISFKVNTELKTVNNYFLLSLACADLIIGTFSMNLYTTYLLMGHWALGTLACDL WLALDYVASNASVMNLLLISFDRYFSVTRPLSYRAKRTPRRAALMIGLAWLVSFVLWAPAILFWQYLVGERTVLAGQCYIQFLSQPIITFGTAMAAFYLP VTVMCTLYWRIYRETENRARELAALQGSETPGKGGGSSSSSERSQPGAEGSPETPPGRCCRCCRAPRLLQAYSWKEEEEEDEGSMESLTSSEGEEPGSEV VIKMPMVDPEAQAPTKQPPRSSPNTVKRPTKKGRDRAGKGQKPRGKEQLAKRKTFSLVKEKKAARTLSAILLAFILTWTPYNIMVLVSTFCKDCVPETLW ELGYWLCYVNSTINPMCYALCNKAFRDTFRLLLLCRWDKRRWRKIPKRPGSVHRTPSRQC
    TTDS00002 BioChemical Class G-protein coupled receptor (rhodopsin family)
    TTDS00002 Pathway Calcium signaling pathway
    TTDS00002 Pathway Neuroactive ligand-receptor interaction
    TTDS00002 Pathway Regulation of actin cytoskeleton
    TTDS00002 Related US Patent 6,288,068
    TTDS00002 Related US Patent 6,294,554
    TTDS00002 Related US Patent 6,627,645
    TTDS00002 Drug(s) Pirenzepine DAP000492 Peptic ulcer disease Approved
    TTDS00002 Drug(s) Glycopyrrolate DAP001116 Anesthetic Approved
    TTDS00002 Drug(s) Clidinium DAP001117 Abdominal/stomach pain Approved
    TTDS00002 Drug(s) Dicyclomine DAP001118 Irritable bowel syndrome Approved
    TTDS00002 Drug(s) Ethopropazine DAP001119 Parkinson's disease Approved
    TTDS00002 Drug(s) Cycrimine DAP001120 Parkinson's disease Approved
    TTDS00002 Drug(s) Benztropine DAP001121 Parkinson's disease Approved
    TTDS00002 Drug(s) Trihexyphenidyl DAP001122 Parkinson's disease Approved
    TTDS00002 Drug(s) Propantheline DAP001123 Excessive sweating (hyperhidrosis) Approved
    TTDS00002 Drug(s) Oxyphenonium DAP001124 Spasm Approved
    TTDS00002 Drug(s) Biperiden DAP001125 Parkinson's disease Approved
    TTDS00002 Antagonist Pirenzepine DAP000492
    TTDS00002 Antagonist Glycopyrrolate DAP001116
    TTDS00002 Antagonist Clidinium DAP001117
    TTDS00002 Antagonist Dicyclomine DAP001118
    TTDS00002 Antagonist Ethopropazine DAP001119
    TTDS00002 Antagonist Benztropine DAP001121
    TTDS00002 Antagonist Trihexyphenidyl DAP001122
    TTDS00002 Antagonist Propantheline DAP001123
    TTDS00002 Antagonist Oxyphenonium DAP001124
    TTDS00002 Antagonist Biperiden DAP001125
    TTDS00002 Binder Cycrimine DAP001120
    TTDS00002 Drug(s) Talsaclidine isomer DCL000268 Alzheimer's disease Discontinued
    TTDS00002 Drug(s) Sabcomeline hydrochloride DCL000279 Cardiovascular diseases Phase IIa
    TTDS00002 Drug(s) Talsaclidine fumarate DCL000303 Alzheimer's disease Discontinued
    TTDS00002 Drug(s) Xanomeline tartrate DCL000328 Alzheimer's disease Phase II
    TTDS00002 Drug(s) GSK573719 DCL000381 Chronic Obstructive Pulmonary Disease (COPD) Phase II
    TTDS00002 Drug(s) GSK961081 DCL000397 Chronic Obstructive Pulmonary Disease (COPD) Phase II completed
    TTDS00002 Drug(s) GSK1034702 DCL000402 Schizophrenia, Dementia Phase I completed
    TTDS00002 Drug(s) Darotropium DCL000514 COPD Suspended in Phase II in GSK 2009 Report
    TTDS00002 Drug(s) Darotropium + 642444 DCL000515 COPD Phase III
    TTDS00002 Drug(s) Revatropate DCL000957 Chronic obstructive pulmonary disease Discontinued in Phase I
    TTDS00002 Antagonist Revatropate DCL000957
    TTDS00002 Agonist Talsaclidine isomer DCL000268
    TTDS00002 Agonist Sabcomeline hydrochloride DCL000279
    TTDS00002 Agonist Talsaclidine fumarate DCL000303
    TTDS00002 Agonist Xanomeline tartrate DCL000328
    TTDS00002 Agonist GSK573719 DCL000381
    TTDS00002 Agonist GSK961081 DCL000397
    TTDS00002 Agonist GSK1034702 DCL000402
    TTDS00002 Agonist Darotropium DCL000514
    TTDS00002 Agonist Darotropium + 642444 DCL000515
    TTDS00002 Multitarget GSK961081 DCL000397
    TTDS00002 Multitarget Revatropate DCL000957
    TTDS00002 Agonist 77-LH-28-1 DNC000099
    TTDS00002 Agonist AC-260584 DNC000137
    TTDS00002 Agonist AC-42 DNC000138
    TTDS00002 Agonist AF150(S) DNC000165
    TTDS00002 Agonist AF267B DNC000166
    TTDS00002 Agonist LY-593039 DNC000910
    TTDS00002 Agonist NGX-267 DNC001012
    TTDS00002 Agonist Sabcomeline DNC001264
    TTDS00002 Agonist WAY-132983 DNC001510
    TTDS00002 Inhibitor Arecoline DNC002508
    TTDS00002 Inhibitor Acetic acid 8-aza-bicyclo[3.2.1]oct-6-yl ester DNC003640
    TTDS00002 Inhibitor Benzoic acid 8-aza-bicyclo[3.2.1]oct-6-yl ester DNC003654
    TTDS00002 Inhibitor Propionic acid 8-aza-bicyclo[3.2.1]oct-6-yl ester DNC003659
    TTDS00002 Inhibitor 3-Methyl-7-pyrrolidin-1-yl-hept-5-yn-2-one DNC004147
    TTDS00002 Inhibitor 2-Methyl-6-pyrrolidin-1-yl-hex-4-ynal oxime DNC004159
    TTDS00002 Inhibitor ISOCLOZAPINE DNC004166
    TTDS00002 Inhibitor SB-202026 DNC004272
    TTDS00002 Inhibitor HIMBACINE DNC004995
    TTDS00002 Inhibitor RR(17)PZ DNC005944
    TTDS00002 Inhibitor Bo(15)PZ DNC005945
    TTDS00002 Inhibitor DIFLUOROBENZTROPINE DNC005986
    TTDS00002 Inhibitor BI-1356 DNC007901
    TTDS00002 Inhibitor FM1-10 DNC008187
    TTDS00002 Inhibitor FM1-43 DNC008188
    TTDS00002 Inhibitor A-987306 DNC008996
    TTDS00002 Inhibitor GNF-PF-5618 DNC009476
    TTDS00002 Inhibitor CREMASTRINE DNC009504
    TTDS00002 Inhibitor 1,1-diphenyl-2-(3-tropanyl)ethanol DNC009866
    TTDS00002 Inhibitor R-dimethindene DNC009877
    TTDS00002 Inhibitor Tiotropium Bromide DNC009882
    TTDS00002 Inhibitor XANOMELINE DNC011170
    TTDS00002 Inhibitor 4-(4-butylpiperidin-1-yl)-1-o-tolylbutan-1-one DNC011171
    TTDS00002 Inhibitor 1-Methyl-1-(4-pyrrolidin-1-yl-but-2-ynyl)-urea DNC011427
    TTDS00002 Inhibitor ISOLOXAPINE DNC011498
    TTDS00002 Inhibitor 1'-Benzyl-3-phenyl-[3,4']bipiperidinyl-2,6-dione DNC011500
    TTDS00002 Inhibitor CARAMIPEN DNC011755
    TTDS00002 Inhibitor FLUMEZAPINE DNC011857
    TTDS00002 Inhibitor AMINOBENZTROPINE DNC011950
    TTDS00002 Inhibitor 2-(4-Diethylamino-but-2-ynyl)-isoindole-1,3-dione DNC012005
    TTDS00002 Inhibitor 3-Tetrazol-2-yl-1-aza-bicyclo[2.2.2]octane DNC012098
    TTDS00002 Inhibitor SULFOARECOLINE DNC012122
    TTDS00002 Inhibitor 6-Dimethylamino-2-methyl-hex-4-ynal oxime DNC012306
    TTDS00002 Inhibitor 7-Pyrrolidin-1-yl-hept-5-yn-2-one DNC012322
    TTDS00002 Inhibitor 7-Dimethylamino-3-methyl-hept-5-yn-2-one DNC012323
    TTDS00002 Inhibitor 7-Pyrrolidin-1-yl-hept-5-yn-2-one oxime DNC012330
    TTDS00002 Inhibitor 7-Dimethylamino-hept-5-yn-2-one DNC012350
    TTDS00002 Inhibitor 7-Dimethylamino-hept-5-yn-2-one oxime DNC012351
    TTDS00002 Inhibitor N-(4-Dimethylamino-but-2-ynyl)-N-methyl-acetamide DNC012363
    TTDS00002 Inhibitor ACECLIDINE DNC012502
    TTDS00002 Inhibitor N-methoxyquinuclidine-3-carboximidoyl fluoride DNC012588
    TTDS00002 Inhibitor BRL-55473 DNC012594
    TTDS00002 Inhibitor N-methoxyquinuclidine-3-carboximidoyl chloride DNC012616
    TTDS00002 Inhibitor 2,8-Dimethyl-1-oxa-8-aza-spiro[4.5]decan-3-one DNC012765
    TTDS00002 Inhibitor 3alpha-(bis-chloro-phenylmethoxy)tropane DNC013136
    TTDS00002 Inhibitor 3-(3-benzylamino)-piperidin-2-one DNC013219
    TTDS00002 Target Validation TTDS00002
    TTDS00003 UniProt ID P08172
    TTDS00003 Name Muscarinic acetylcholine receptor M2
    TTDS00003 Type of target Successful target
    My expected output is:

    P11229 Pirenzepine DAP000492 Peptic ulcer disease Approved
    P11229 Glycopyrrolate DAP001116 Anesthetic Approved
    P11229 Clidinium DAP001117 Abdominal stomach pain Approved
    P11229 Dicyclomine DAP001118 Irritable bowel syndrome Approved
    P11229 Ethopropazine DAP001119 Parkinson's disease Approved
    P11229 Cycrimine DAP001120 Parkinson's disease Approved
    P11229 Benztropine DAP001121 Parkinson's disease Approved
    P11229 Trihexyphenidyl DAP001122 Parkinson's disease Approved
    P11229 Propantheline DAP001123 Excessive sweating (hyperhidrosis) Approved
    P11229 Oxyphenonium DAP001124 Spasm Approved
    P11229 Biperiden DAP001125 Parkinson's disease Approved
    P11229 Talsaclidine isomer DCL000268 Alzheimer's disease Discontinued
    P11229 Sabcomeline hydrochloride DCL000279 Cardiovascular diseases Phase IIa
    P11229 Talsaclidine fumarate DCL000303 Alzheimer's disease Discontinued
    P11229 Xanomeline tartrate DCL000328 Alzheimer's disease Phase II
    P11229 GSK573719 DCL000381 Chronic Obstructive Pulmonary Disease (COPD) Phase II
    P11229 GSK961081 DCL000397 Chronic Obstructive Pulmonary Disease (COPD) Phase II completed
    P11229 GSK1034702 DCL000402 Schizophrenia, Dementia Phase I completed
    P11229 Darotropium DCL000514 COPD Suspended in Phase II in GSK 2009 Report
    P11229 Darotropium + 642444 DCL000515 COPD Phase III
    P11229 Revatropate DCL000957 Chronic obstructive pulmonary disease Discontinued in Phase I
    P11229 Trospium DAP000342
    P11229 Hyoscyamine DAP001108
    P11229 Methantheline DAP001109
    P11229 Procyclidine DAP001110
    P11229 Cyclopentolate DAP001111
    P11229 Ipratropium DAP001112
    P11229 Flavoxate DAP001114
    P11229 Mepenzolate DAP001115
    P11229 Ispaghula DAP001486
    P11229 Mebeverine DAP001494
    P11229 Trihexyphenidyl HCl DAP001532
    P11229 Bethanechol DAP000263
    P11229 Pilocarpine DAP001113
    P11229 Oxyphencyclimine DAP000835
    P11229 Tridihexethyl DAP000836
    P11229 Anisotropine Methylbromide DAP000837
    P11229 Aclidinium bromide DCL000677 Chronic obstructive pulmonary disease Phase III
    P11229 CHF 5407 DCL000750 Chronic obstructive pulmonary disease Phase I
    P11229 GSK233705 DCL000823 Chronic obstructive pulmonary disease Phase II completed
    P11229 NVA237 DCL000901 Chronic obstructive pulmonary disease Phase III
    P11229 Org-23366 DCL000911 Schizophrenia No development reported
    P11229 OrM3 DCL000913 Chronic obstructive pulmonary disease Phase IIb
    P11229 Aclidinium bromide DCL000677
    P11229 CHF 5407 DCL000750
    P11229 GSK233705 DCL000823
    P11229 NVA237 DCL000901
    P11229 Org-23366 DCL000911
    P11229 OrM3 DCL000913
    P11229 Org-23366 DCL000911
    P11229 Aprophen DNC000245
    P11229 Benactyzine DNC000293
    P11229 Hyoscine DNC000757
    P11229 Hyoscyamine sulfate DNC000758
    P11229 Ipratropium bromide DNC000806
    P11229 Muscarine DNC000970
    P11229 RS 86 DNC001236

    I am using this Perl script:

    #!/usr/bin/perl -w

    use strict;

    if ($#ARGV < 1) {
        print "Usage: $0 input_file output_file\n";
        exit 0;
    }

    my $input_file = $ARGV[0];
    my $output_file = $ARGV[1];

    my $prev_ttds_num = '';
    my @diseases = ();
    my $drug_name = '';

    open my $IFH, '<', $input_file or die "$!\n";
    open my $OFH, '>', $output_file or die "$!\n";

    while (my $line = <$IFH>) {
        chomp $line;
        next if $line eq '';
        my @array = split /\t/, $line;

        my $ttds_num = shift @array;
        my $rec_type = shift @array;

        if ($ttds_num ne $prev_ttds_num) {
            if ($prev_ttds_num ne '') {
                dump_data($OFH, $drug_name, \@diseases);
            }
            $prev_ttds_num = $ttds_num;
            @diseases = ();
            $drug_name = '';
        }

        if ($rec_type eq 'Name') {
            $drug_name = shift @array;
        }
        elsif ($rec_type eq 'Drug(s)') {
            my $part_record = join("\t", @array);
            push @diseases, $part_record;
        }
    }

    dump_data($OFH, $drug_name, \@diseases);
    close $OFH;
    close $IFH;

    print "Done\n";
    exit 0;

    ######################################################################
    # #
    # S U B R O U T I N E S #
    # #
    ######################################################################

    ##
    # @brief Routine to dump out multiple records
    # @param FH - A file handle to write data out
    # @param drug_name - The name of the drug
    # @param disease_ref - An array references to list of disease data
    # @return undef
    #
    sub dump_data {
        my ($FH, $drug_name, $disease_ref) = @_;
        return if ($drug_name eq '');

        foreach my $disease (@{ $disease_ref }) {
            print ${FH} "$drug_name\t$disease\n";
        }

        return;
    }
    But my output is:

    P11229 Pirenzepine DAP000492 Peptic ulcer disease Approved
    P11229 Glycopyrrolate DAP001116 Anesthetic Approved
    P11229 Clidinium DAP001117 Abdominal/stomach pain Approved
    P11229 Dicyclomine DAP001118 Irritable bowel syndrome Approved
    P11229 Ethopropazine DAP001119 Parkinson's disease Approved
    P11229 Cycrimine DAP001120 Parkinson's disease Approved
    P11229 Benztropine DAP001121 Parkinson's disease Approved
    P11229 Trihexyphenidyl DAP001122 Parkinson's disease Approved
    P11229 Propantheline DAP001123 Excessive sweating (hyperhidrosis) Approved
    P11229 Oxyphenonium DAP001124 Spasm Approved
    P11229 Biperiden DAP001125 Parkinson's disease Approved
    P11229 Talsaclidine isomer DCL000268 Alzheimer's disease Discontinued
    P11229 Sabcomeline hydrochloride DCL000279 Cardiovascular diseases Phase IIa
    P11229 Talsaclidine fumarate DCL000303 Alzheimer's disease Discontinued
    P11229 Xanomeline tartrate DCL000328 Alzheimer's disease Phase II
    P11229 GSK573719 DCL000381 Chronic Obstructive Pulmonary Disease (COPD) Phase II
    P11229 GSK961081 DCL000397 Chronic Obstructive Pulmonary Disease (COPD) Phase II completed
    P11229 GSK1034702 DCL000402 Schizophrenia, Dementia Phase I completed
    P11229 Darotropium DCL000514 COPD Suspended in Phase II in GSK 2009 Report
    P11229 Darotropium + 642444 DCL000515 COPD Phase III
    P11229 Revatropate DCL000957 Chronic obstructive pulmonary disease Discontinued in Phase I

    Kindly help me change the Perl script to get the desired output.

    Mani
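For readers following along, here is a minimal sketch of one way the record handling above might be adapted. It is not taken from the thread and rests on several assumptions: that each output row should start with the UniProt ID instead of the target name, that the shorter "name plus ID" rows in the expected output come from record types such as Antagonist or Agonist, that the UniProt ID row precedes the other rows of the same target, and that fields in the real file are tab-separated. The name `extract_records` and the contents of `%wanted` are hypothetical and would need adjusting to the actual data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ASSUMPTION: these are the record types whose rows should be copied to the
# output; the thread does not say which ones are actually wanted.
my %wanted = map { $_ => 1 }
    ('Drug(s)', 'Antagonist', 'Agonist', 'Binder', 'Inhibitor', 'Multitarget');

# Hypothetical helper: takes tab-separated input lines and returns output
# lines of the form "<UniProt ID>\t<remaining fields>", in input order.
sub extract_records {
    my @lines = @_;
    my %uniprot_for;    # TTDS number => UniProt ID
    my @out;

    for my $line (@lines) {
        chomp $line;
        next if $line eq '';
        my ($ttds_num, $rec_type, @rest) = split /\t/, $line;
        next unless defined $rec_type && @rest;

        if ($rec_type eq 'UniProt ID') {
            # Remember the ID so later rows of this target can be prefixed.
            $uniprot_for{$ttds_num} = $rest[0];
        }
        elsif ($wanted{$rec_type} && exists $uniprot_for{$ttds_num}) {
            push @out, join("\t", $uniprot_for{$ttds_num}, @rest);
        }
    }
    return @out;
}

# Small demonstration using rows from the sample input
# (ASSUMPTION: the real file separates fields with tabs).
my @sample = (
    "TTDS00002\tUniProt ID\tP11229",
    "TTDS00002\tName\tMuscarinic acetylcholine receptor M1",
    "TTDS00002\tDrug(s)\tPirenzepine\tDAP000492\tPeptic ulcer disease\tApproved",
    "TTDS00002\tAntagonist\tPirenzepine\tDAP000492",
);
print "$_\n" for extract_records(@sample);
```

Rows come out in the order they appear in the input, so Drug(s) rows and the shorter rows may be interleaved. If all Drug(s) rows should come first, they could be collected in a separate array and printed before the others.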
  2. #2
    See my request for more info on your cross post at perlguru.
  4. #3
    Dude, it has been over a year since I wrote that code for you, and you're still using it unmodified? You should have learned at least a little bit of programming by now.

    Come to think of it, I think you asked the same question about 6 months ago in two different languages:
    http://forums.devshed.com/perl-progr...ta-935068.html
    http://forums.devshed.com/python-pro...es-935989.html

    Comments on this post

    • Laurent_R agrees : It looks like I was naive to think that Manigrover had made progress. See my post below.
  6. #4
    Originally Posted by Scorpions4ever
    Dude, it has been over a year since I wrote that code for you, and you're still using it unmodified? You should have learned at least a little bit of programming by now.
    Oh, I see. When I saw Mani's code, I thought: "Gee, Mani has made progress; he (or she) is now posting code to be corrected rather than asking for ready-made code." It seems I was very naive.
