#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2012
    Posts
    63
    Rep Power
    3

    Fetch specific entries


    Hi all

    My input file is

    PHP Code:
    TTDS00001    UniProt ID    P11229
    TTDS00001    Name    Muscarinic acetylcholine receptor 
    TTDS00001    Type of target    Successful target 
    TTDS00001    Synonyms    (m)AChR
    TTDS00001    Synonyms    MAChR 
    TTDS00001    Disease    Alzheimer's disease
    TTDS00001    Disease    Bronchospasm (histamine induced) 
    TTDS00001    Disease    Glaucoma 
    TTDS00001    Disease    Motion sickness 
    TTDS00001    Disease    Obstructive airway disease 
    TTDS00001    Disease    Organophosphate poisoning 
    TTDS00001    Disease    Schizophrenia 
    TTDS00001    Disease    Urinary incontinence 
    TTDS00001    Disease    Xerostomia 
    TTDS00001    BioChemical Class    G-protein coupled receptor (rhodopsin family) 
    TTDS00001    Pathway    Calcium signaling pathway 
    TTDS00001    Pathway    Neuroactive ligand-receptor interaction 
    TTDS00001    Pathway    Regulation of actin cytoskeleton 
    TTDS00001    Related US Patent    6,211,204 
    TTDS00001    Related US Patent    6,323,194 
    TTDS00001    Related US Patent    6,369,081 
    TTDS00001    Related US Patent    6,376,675 
    TTDS00001    Related US Patent    6,423,842 
    TTDS00001    Related US Patent    6,451,797 
    TTDS00001    Related US Patent    6,455,552 
    TTDS00001    Related US Patent    6,458,812 
    TTDS00001    Related US Patent    6,555,550 
    TTDS00001    Related US Patent    6,602,891 
    TTDS00001    Drug(s)    Bethanechol    DAP000263    Urinary retention    Approved 
    TTDS00001    Drug(s)    Trospium    DAP000342    Spasm    Approved 
    TTDS00001    Drug(s)    Oxyphencyclimine    DAP000835    Gastrointestinal disorders    Approved 
    TTDS00001    Drug(s)    Tridihexethyl    DAP000836    Acquired nystagmus    Approved 
    TTDS00001    Drug(s)    Anisotropine Methylbromide    DAP000837    Peptic ulcer disease    Approved 
    TTDS00001    Drug(s)    Hyoscyamine    DAP001108    Gastrointestinal disorders    Approved 
    TTDS00001    Drug(s)    Methantheline    DAP001109    Irritable bowel syndrome    Approved 
    TTDS00001    Drug(s)    Procyclidine    DAP001110    Parkinson's disease    Approved
    TTDS00001    Drug(s)    Cyclopentolate    DAP001111    Pediatric eye examinations    Approved
    TTDS00001    Drug(s)    Ipratropium    DAP001112    Obstructive lung diseases    Approved
    TTDS00001    Drug(s)    Pilocarpine    DAP001113    Glaucoma    Approved
    TTDS00001    Drug(s)    Flavoxate    DAP001114    Muscle Relaxant    Approved
    TTDS00001    Drug(s)    Mepenzolate    DAP001115    Peptic ulcer disease    Approved
    TTDS00001    Drug(s)    Ispaghula    DAP001486    Irritable bowel syndrome    Approved
    TTDS00001    Drug(s)    Mebeverine    DAP001494    Irritable bowel syndrome    Approved
    TTDS00001    Drug(s)    Trihexyphenidyl HCl    DAP001532    Parkinson's Disease    Approved
    TTDS00001    Antagonist    Trospium    DAP000342 
    TTDS00001    Antagonist    Hyoscyamine    DAP001108 
    TTDS00001    Antagonist    Methantheline    DAP001109 
    TTDS00001    Antagonist    Procyclidine    DAP001110 
    TTDS00001    Antagonist    Cyclopentolate    DAP001111 
    TTDS00001    Antagonist    Ipratropium    DAP001112 
    TTDS00001    Antagonist    Flavoxate    DAP001114 
    TTDS00001    Antagonist    Mepenzolate    DAP001115 
    TTDS00001    Antagonist    Ispaghula    DAP001486 
    TTDS00001    Antagonist    Mebeverine    DAP001494 
    TTDS00001    Antagonist    Trihexyphenidyl HCl    DAP001532 
    TTDS00001    Agonist    Bethanechol    DAP000263 
    TTDS00001    Agonist    Pilocarpine    DAP001113 
    TTDS00001    Binder    Oxyphencyclimine    DAP000835 
    TTDS00001    Binder    Tridihexethyl    DAP000836 
    TTDS00001    Binder    Anisotropine Methylbromide    DAP000837 
    TTDS00001    Drug(s)    Aclidinium bromide    DCL000677    Chronic obstructive pulmonary disease    Phase III 
    TTDS00001    Drug(s)    CHF 5407    DCL000750    Chronic obstructive pulmonary disease    Phase I 
    TTDS00001    Drug(s)    GSK233705    DCL000823    Chronic obstructive pulmonary disease    Phase II completed 
    TTDS00001    Drug(s)    NVA237    DCL000901    Chronic obstructive pulmonary disease    Phase III 
    TTDS00001    Drug(s)    Org-23366    DCL000911    Schizophrenia    No development reported 
    TTDS00001    Drug(s)    OrM3    DCL000913    Chronic obstructive pulmonary disease    Phase IIb 
    TTDS00001    Antagonist    Aclidinium bromide    DCL000677 
    TTDS00001    Antagonist    CHF 5407    DCL000750 
    TTDS00001    Antagonist    GSK233705    DCL000823 
    TTDS00001    Antagonist    NVA237    DCL000901 
    TTDS00001    Antagonist    Org-23366    DCL000911 
    TTDS00001    Antagonist    OrM3    DCL000913 
    TTDS00001    Multitarget    Org-23366    DCL000911 
    TTDS00001    Antagonist    Aprophen    DNC000245 
    TTDS00001    Antagonist    Benactyzine    DNC000293 
    TTDS00001    Antagonist    Hyoscine    DNC000757 
    TTDS00001    Antagonist    Hyoscyamine sulfate    DNC000758 
    TTDS00001    Antagonist    Ipratropium bromide    DNC000806 
    TTDS00001    Agonist    Muscarine    DNC000970 
    TTDS00001    Agonist    RS 86    DNC001236 
    TTDS00001    Target Validation    TTDS00001 
    TTDS00002    UniProt ID    P11229 
    TTDS00002    Name    Muscarinic acetylcholine receptor M1 
    TTDS00002    Type of target    Successful target 
    TTDS00002    Synonyms    M1 receptor 
    TTDS00002    Disease    Alzheimer's disease
    TTDS00002    Disease    Bronchospasm (histamine induced)
    TTDS00002    Disease    Cognitive deficits 
    TTDS00002    Disease    Schizophrenia 
    TTDS00002    Function    The muscarinic acetylcholine receptor mediates various cellular responses, including inhibition of adenylate cyclase, breakdown of phosphoinositides and modulation of potassium channels through the action of G proteins
    TTDS00002    Sequence    MNTSAPPAVSPNITVLAPGKGPWQVAFIGITTGLLSLATVTGNLLVLISFKVNTELKTVNNYFLLSLACADLIIGTFSMNLYTTYLLMGHWALGTLACDL   WLALDYVASNASVMNLLLISFDRYFSVTRPLSYRAKRTPRRAALMIGLAWLVSFVLWAPAILFWQYLVGERTVLAGQCYIQFLSQPIITFGTAMAAFYLP   VTVMCTLYWRIYRETENRARELAALQGSETPGKGGGSSSSSERSQPGAEGSPETPPGRCCRCCRAPRLLQAYSWKEEEEEDEGSMESLTSSEGEEPGSEV   VIKMPMVDPEAQAPTKQPPRSSPNTVKRPTKKGRDRAGKGQKPRGKEQLAKRKTFSLVKEKKAARTLSAILLAFILTWTPYNIMVLVSTFCKDCVPETLW   ELGYWLCYVNSTINPMCYALCNKAFRDTFRLLLLCRWDKRRWRKIPKRPGSVHRTPSRQC 
    TTDS00002    BioChemical Class    G-protein coupled receptor (rhodopsin family)
    TTDS00002    Pathway    Calcium signaling pathway 
    TTDS00002    Pathway    Neuroactive ligand-receptor interaction
    TTDS00002    Pathway    Regulation of actin cytoskeleton 
    TTDS00002    Related US Patent    6,288,068
    TTDS00002    Related US Patent    6,294,554
    TTDS00002    Related US Patent    6,627,645
    TTDS00002    Drug(s)    Pirenzepine    DAP000492    Peptic ulcer disease    Approved
    TTDS00002    Drug(s)    Glycopyrrolate    DAP001116    Anesthetic    Approved
    TTDS00002    Drug(s)    Clidinium    DAP001117    Abdominal/stomach pain    Approved
    TTDS00002    Drug(s)    Dicyclomine    DAP001118    Irritable bowel syndrome    Approved
    TTDS00002    Drug(s)    Ethopropazine    DAP001119    Parkinson's disease    Approved
    TTDS00002    Drug(s)    Cycrimine    DAP001120    Parkinson's disease    Approved
    TTDS00002    Drug(s)    Benztropine    DAP001121    Parkinson's disease    Approved
    TTDS00002    Drug(s)    Trihexyphenidyl    DAP001122    Parkinson's disease    Approved
    TTDS00002    Drug(s)    Propantheline    DAP001123    Excessive sweating (hyperhidrosis)    Approved
    TTDS00002    Drug(s)    Oxyphenonium    DAP001124    Spasm    Approved
    TTDS00002    Drug(s)    Biperiden    DAP001125    Parkinson's disease    Approved
    TTDS00002    Antagonist    Pirenzepine    DAP000492 
    TTDS00002    Antagonist    Glycopyrrolate    DAP001116 
    TTDS00002    Antagonist    Clidinium    DAP001117 
    TTDS00002    Antagonist    Dicyclomine    DAP001118 
    TTDS00002    Antagonist    Ethopropazine    DAP001119 
    TTDS00002    Antagonist    Benztropine    DAP001121 
    TTDS00002    Antagonist    Trihexyphenidyl    DAP001122 
    TTDS00002    Antagonist    Propantheline    DAP001123 
    TTDS00002    Antagonist    Oxyphenonium    DAP001124 
    TTDS00002    Antagonist    Biperiden    DAP001125 
    TTDS00002    Binder    Cycrimine    DAP001120 
    TTDS00002    Drug(s)    Talsaclidine isomer    DCL000268    Alzheimer's disease    Discontinued
    TTDS00002    Drug(s)    Sabcomeline hydrochloride    DCL000279    Cardiovascular diseases    Phase IIa
    TTDS00002    Drug(s)    Talsaclidine fumarate    DCL000303    Alzheimer's disease    Discontinued
    TTDS00002    Drug(s)    Xanomeline tartrate    DCL000328    Alzheimer's disease    Phase II
    TTDS00002    Drug(s)    GSK573719    DCL000381    Chronic Obstructive Pulmonary Disease (COPD)    Phase II
    TTDS00002    Drug(s)    GSK961081    DCL000397    Chronic Obstructive Pulmonary Disease (COPD)    Phase II completed
    TTDS00002    Drug(s)    GSK1034702    DCL000402    SchizophreniaDementia    Phase I completed
    TTDS00002    Drug(s)    Darotropium    DCL000514    COPD    Suspended in Phase II in GSK 2009 Report
    TTDS00002    Drug(s)    Darotropium 642444    DCL000515    COPD    Phase III
    TTDS00002    Drug(s)    Revatropate    DCL000957    Chronic obstructive pulmonary disease    Discontinued in Phase I
    TTDS00002    Antagonist    Revatropate    DCL000957 
    TTDS00002    Agonist    Talsaclidine isomer    DCL000268 
    TTDS00002    Agonist    Sabcomeline hydrochloride    DCL000279 
    TTDS00002    Agonist    Talsaclidine fumarate    DCL000303 
    TTDS00002    Agonist    Xanomeline tartrate    DCL000328 
    TTDS00002    Agonist    GSK573719    DCL000381 
    TTDS00002    Agonist    GSK961081    DCL000397 
    TTDS00002    Agonist    GSK1034702    DCL000402 
    TTDS00002    Agonist    Darotropium    DCL000514 
    TTDS00002    Agonist    Darotropium 642444    DCL000515
    TTDS00002    Multitarget    GSK961081    DCL000397 
    TTDS00002    Multitarget    Revatropate    DCL000957 
    TTDS00002    Agonist    77-LH-28-1    DNC000099
    TTDS00002    Agonist    AC-260584    DNC000137
    TTDS00002    Agonist    AC-42    DNC000138
    TTDS00002    Agonist    AF150(S)    DNC000165
    TTDS00002    Agonist    AF267B    DNC000166
    TTDS00002    Agonist    LY-593039    DNC000910
    TTDS00002    Agonist    NGX-267    DNC001012
    TTDS00002    Agonist    Sabcomeline    DNC001264
    TTDS00002    Agonist    WAY-132983    DNC001510
    TTDS00002    Inhibitor    Arecoline    DNC002508
    TTDS00002    Inhibitor    Acetic acid 8-aza-bicyclo[3.2.1]oct-6-yl ester    DNC003640
    TTDS00002    Inhibitor    Benzoic acid 8-aza-bicyclo[3.2.1]oct-6-yl ester    DNC003654
    TTDS00002    Inhibitor    Propionic acid 8-aza-bicyclo[3.2.1]oct-6-yl ester    DNC003659
    TTDS00002    Inhibitor    3-Methyl-7-pyrrolidin-1-yl-hept-5-yn-2-one    DNC004147
    TTDS00002    Inhibitor    2-Methyl-6-pyrrolidin-1-yl-hex-4-ynal oxime    DNC004159
    My sample expected output, fetching specific entries into 6 columns, is as follows. Kindly guide.

    PHP Code:
    P11229    Talsaclidine isomer    DCL000268    Alzheimer's disease    Discontinued    agonist
    P11229    Sabcomeline hydrochloride    DCL000279    Cardiovascular diseases    Phase IIa    agonist
    P11229    Talsaclidine fumarate    DCL000303    Alzheimer's disease    Discontinued    agonist
    P11229    Xanomeline tartrate    DCL000328    Alzheimer's disease    Phase II    agonist
  2. #2
  3. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2012
    Location
    39N 104.28W
    Posts
    158
    Rep Power
    3
    From what little information you have provided, it looks like the first 2 fields of the input are consistent. I would read each line, split it on spaces, and collect the first 2 elements.

    For the rest, you need to provide more information about how to parse the data. For instance, perhaps if the last word is ---? then the format is ***?
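
A minimal sketch of the read-and-split step suggested above. It assumes the fields are tab-separated, so that multi-word keys such as "UniProt ID" stay in one piece (the spaces shown in the post may be collapsed tabs; that is an assumption at this point in the thread). The function name `split_record` is hypothetical.

```python
def split_record(line):
    """Split one input line into (target_id, key, remaining_fields).

    Assumes tab-separated fields; returns None for lines with fewer
    than two fields (for example, blank lines).
    """
    fields = [field.strip() for field in line.rstrip("\n").split("\t")]
    if len(fields) < 2:
        return None
    return fields[0], fields[1], fields[2:]


print(split_record("TTDS00001\tUniProt ID\tP11229"))
```

Everything after the first two fields is returned as a list, since how to interpret it depends on the key.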
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2012
    Posts
    63
    Rep Power
    3
    Thanks for the reply.

    The input above is part of a big file that contains the same sort of data for TTDS00002, TTDS00003 and so on.

    From the input I have to fetch the data so that I get the UniProt ID, drug name, drug ID, disease name, approved/phase status, and action (agonist/inhibitor).

    If the disease name is not given, as in the last section of the input (which I think you described as inconsistent), the disease column should be blank and the approved/phase column should be blank as well; only the UniProt ID, drug name, drug ID and agonist/inhibitor will be included.

    The sample output I included above is a small part of the expected output.
  6. #4
  7. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,837
    Rep Power
    480
    I asked my son, a pharmacist, and his wife, a pharmacy student, to examine your puzzle. If they understand how the rows map from input to output I'll let you know.
    [code]Code tags[/code] are essential for python code and Makefiles!
  8. #5
  9. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,837
    Rep Power
    480
    If this helps anyone's comprehension:
    Originally Posted by Drew
    I read the original post and I am still not sure how exactly the guy is supposed to go about actually looking at all of this. I understand that the output is different than the input, meaning, the columns don't match up. What seems to be unique in all of this is the drug name. All of the information for the output file can be taken from rows that contain the drug name. Furthermore, there is standardization in how each element is reported: the Uniprot ID is always the same, the drug ID always starts with D...
    Uniprot id, drug name, drug ID, disease name, approved/phase, agonist/inhibitor

    Now coming from essentially a layperson's standpoint in how to code this, I am not sure. But it seems that if there was some way to tell the program to fetch all of the lines that include the specific drug names, then you could tell the program what to look for, how to reorganize it, and how to output it.
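
One way to read the suggestion above is to collect every row that mentions the same drug ID. The sketch below is only an illustration: it assumes tab-separated fields and assumes drug IDs look like `DAP000263`, `DCL000677` or `DNC000245` (a `D`, two capital letters, six digits), a pattern inferred from the posted sample rather than stated anywhere. The name `group_by_drug_id` is hypothetical.

```python
import re
from collections import defaultdict

# Assumed ID pattern, inferred from the sample data in this thread.
DRUG_ID = re.compile(r"D[A-Z]{2}\d{6}")


def group_by_drug_id(lines):
    """Map each drug ID to the list of rows (as field lists) that mention it."""
    groups = defaultdict(list)
    for line in lines:
        fields = [field.strip() for field in line.split("\t")]
        for field in fields:
            if DRUG_ID.fullmatch(field):
                groups[field].append(fields)
                break  # one drug ID per row is enough
    return dict(groups)


sample = [
    "TTDS00001\tDrug(s)\tBethanechol\tDAP000263\tUrinary retention\tApproved",
    "TTDS00001\tAgonist\tBethanechol\tDAP000263",
    "TTDS00001\tDisease\tGlaucoma",
]
groups = group_by_drug_id(sample)
for drug_id, rows in groups.items():
    print(drug_id, len(rows))
```

Rows without a drug ID, such as the `Disease` row, are simply skipped.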
  10. #6
  11. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2012
    Location
    39N 104.28W
    Posts
    158
    Rep Power
    3
    It doesn't help me. But maybe this.
    @manigrover, forget programming for a minute. If you were a human being reading the input and hand-writing the output, how would you know what to write?
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2012
    Posts
    63
    Rep Power
    3
    Thanks for the reply.

    I am shortening my input first. For example, if this is part of the input file:

    TTDS00001 UniProt ID P11229
    TTDS00001 Name Muscarinic acetylcholine receptor
    TTDS00001 Type of target Successful target
    TTDS00001 Synonyms (m)AChR
    TTDS00001 Synonyms MAChR
    TTDS00001 Disease Alzheimer's disease
    TTDS00001 Disease Bronchospasm (histamine induced)
    TTDS00001 Disease Glaucoma
    TTDS00001 Disease Motion sickness
    TTDS00001 Disease Obstructive airway disease
    TTDS00001 Disease Organophosphate poisoning
    TTDS00001 Disease Schizophrenia
    TTDS00001 Disease Urinary incontinence
    TTDS00001 Disease Xerostomia
    TTDS00001 BioChemical Class G-protein coupled receptor (rhodopsin family)
    TTDS00001 Pathway Calcium signaling pathway
    TTDS00001 Pathway Neuroactive ligand-receptor interaction
    TTDS00001 Pathway Regulation of actin cytoskeleton
    TTDS00001 Related US Patent 6,211,204
    TTDS00001 Related US Patent 6,323,194
    TTDS00001 Related US Patent 6,369,081
    TTDS00001 Related US Patent 6,376,675
    TTDS00001 Related US Patent 6,423,842
    TTDS00001 Related US Patent 6,451,797
    TTDS00001 Related US Patent 6,455,552
    TTDS00001 Related US Patent 6,458,812
    TTDS00001 Related US Patent 6,555,550
    TTDS00001 Related US Patent 6,602,891
    TTDS00001 Drug(s) Bethanechol DAP000263 Urinary retention Approved
    TTDS00001 Drug(s) Trospium DAP000342 Spasm Approved
    TTDS00001 Drug(s) Oxyphencyclimine DAP000835 Gastrointestinal disorders Approved
    TTDS00001 Drug(s) Tridihexethyl DAP000836 Acquired nystagmus Approved
    TTDS00001 Drug(s) Anisotropine Methylbromide DAP000837 Peptic ulcer disease Approved
    TTDS00001 Drug(s) Hyoscyamine DAP001108 Gastrointestinal disorders Approved
    TTDS00001 Drug(s) Methantheline DAP001109 Irritable bowel syndrome Approved
    TTDS00001 Drug(s) Procyclidine DAP001110 Parkinson's disease Approved
    TTDS00001 Drug(s) Cyclopentolate DAP001111 Pediatric eye examinations Approved
    TTDS00001 Drug(s) Ipratropium DAP001112 Obstructive lung diseases Approved
    TTDS00001 Drug(s) Pilocarpine DAP001113 Glaucoma Approved
    TTDS00001 Drug(s) Flavoxate DAP001114 Muscle Relaxant Approved
    TTDS00001 Drug(s) Mepenzolate DAP001115 Peptic ulcer disease Approved
    TTDS00001 Drug(s) Ispaghula DAP001486 Irritable bowel syndrome Approved
    TTDS00001 Drug(s) Mebeverine DAP001494 Irritable bowel syndrome Approved
    TTDS00001 Drug(s) Trihexyphenidyl HCl DAP001532 Parkinson's Disease Approved
    TTDS00001 Antagonist Trospium DAP000342
    TTDS00001 Antagonist Hyoscyamine DAP001108
    TTDS00001 Antagonist Methantheline DAP001109
    TTDS00001 Antagonist Procyclidine DAP001110
    TTDS00001 Antagonist Cyclopentolate DAP001111
    TTDS00001 Antagonist Ipratropium DAP001112
    TTDS00001 Antagonist Flavoxate DAP001114
    TTDS00001 Antagonist Mepenzolate DAP001115
    TTDS00001 Antagonist Ispaghula DAP001486
    TTDS00001 Antagonist Mebeverine DAP001494
    TTDS00001 Antagonist Trihexyphenidyl HCl DAP001532
    TTDS00001 Agonist Bethanechol DAP000263
    TTDS00001 Agonist Pilocarpine DAP001113
    TTDS00001 Binder Oxyphencyclimine DAP000835
    TTDS00001 Binder Tridihexethyl DAP000836
    TTDS00001 Binder Anisotropine Methylbromide DAP000837
    TTDS00001 Drug(s) Aclidinium bromide DCL000677 Chronic obstructive pulmonary disease Phase III
    TTDS00001 Drug(s) CHF 5407 DCL000750 Chronic obstructive pulmonary disease Phase I
    TTDS00001 Drug(s) GSK233705 DCL000823 Chronic obstructive pulmonary disease Phase II completed
    TTDS00001 Drug(s) NVA237 DCL000901 Chronic obstructive pulmonary disease Phase III
    TTDS00001 Drug(s) Org-23366 DCL000911 Schizophrenia No development reported
    TTDS00001 Drug(s) OrM3 DCL000913 Chronic obstructive pulmonary disease Phase IIb
    TTDS00001 Antagonist Aclidinium bromide DCL000677
    TTDS00001 Antagonist CHF 5407 DCL000750
    TTDS00001 Antagonist GSK233705 DCL000823
    TTDS00001 Antagonist NVA237 DCL000901
    TTDS00001 Antagonist Org-23366 DCL000911
    TTDS00001 Antagonist OrM3 DCL000913
    TTDS00001 Multitarget Org-23366 DCL000911
    TTDS00001 Antagonist Aprophen DNC000245
    TTDS00001 Antagonist Benactyzine DNC000293
    TTDS00001 Antagonist Hyoscine DNC000757
    TTDS00001 Antagonist Hyoscyamine sulfate DNC000758
    TTDS00001 Antagonist Ipratropium bromide DNC000806
    TTDS00001 Agonist Muscarine DNC000970
    TTDS00001 Agonist RS 86 DNC001236
    TTDS00001 Target Validation TTDS00001
    Then part of the expected sample output should contain columns like this:

    Uniprot id Drug Drug Id Disease Approved/Phase Action

    P11229 Bethanechol DAP000263 Urinary retention Approved Agonist
    In the same way, the other drugs and diseases follow underneath. If any information such as the disease name or approved/phase status is missing (for example, for the drugs listed last), that column will remain blank.
  14. #8
  15. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,837
    Rep Power
    480
    Why isn't

    Pilocarpine DAP001113 Glaucoma Approved

    in the output?
  16. #9
  17. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2012
    Posts
    63
    Rep Power
    3
    Thanks for the reply.

    I only mentioned one drug. Sorry, I wasn't able to arrange all the drugs and diseases manually.

    Yes, all these drugs, including the one you mentioned, will also be in the output.

    TTDS00001 Drug(s) Bethanechol DAP000263 Urinary retention Approved
    TTDS00001 Drug(s) Trospium DAP000342 Spasm Approved
    TTDS00001 Drug(s) Oxyphencyclimine DAP000835 Gastrointestinal disorders Approved
    TTDS00001 Drug(s) Tridihexethyl DAP000836 Acquired nystagmus Approved
    TTDS00001 Drug(s) Anisotropine Methylbromide DAP000837 Peptic ulcer disease Approved
    TTDS00001 Drug(s) Hyoscyamine DAP001108 Gastrointestinal disorders Approved
    TTDS00001 Drug(s) Methantheline DAP001109 Irritable bowel syndrome Approved
    TTDS00001 Drug(s) Procyclidine DAP001110 Parkinson's disease Approved
    TTDS00001 Drug(s) Cyclopentolate DAP001111 Pediatric eye examinations Approved
    TTDS00001 Drug(s) Ipratropium DAP001112 Obstructive lung diseases Approved
    TTDS00001 Drug(s) Pilocarpine DAP001113 Glaucoma Approved
    TTDS00001 Drug(s) Flavoxate DAP001114 Muscle Relaxant Approved
    TTDS00001 Drug(s) Mepenzolate DAP001115 Peptic ulcer disease Approved
    TTDS00001 Drug(s) Ispaghula DAP001486 Irritable bowel syndrome Approved
    TTDS00001 Drug(s) Mebeverine DAP001494 Irritable bowel syndrome Approved
    TTDS00001 Drug(s) Trihexyphenidyl HCl DAP001532 Parkinson's Disease Approved
    Drugs that appear at the end or in between with only an action (Antagonist, Inhibitor) and no disease name will also be included,

    like these:

    TTDS00001 Antagonist CHF 5407 DCL000750
    TTDS00001 Antagonist GSK233705 DCL000823
    TTDS00001 Antagonist NVA237 DCL000901
    TTDS00001 Antagonist Org-23366 DCL000911
    TTDS00001 Antagonist OrM3 DCL000913
    TTDS00001 Multitarget Org-23366 DCL000911
    TTDS00001 Antagonist Aprophen DNC000245
    TTDS00001 Antagonist Benactyzine DNC000293
    TTDS00001 Antagonist Hyoscine DNC000757
    TTDS00001 Antagonist Hyoscyamine sulfate DNC000758
    TTDS00001 Antagonist Ipratropium bromide DNC000806
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2012
    Posts
    63
    Rep Power
    3
    If anyone still has questions about the above, please let me know. Part of the problem has already been solved with a Perl script, but I want to solve it in Python.
  20. #11
  21. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,837
    Rep Power
    480
    Answer rrashkin's crucial question, "If you were a human being reading the input and hand-writing the output, how would you know what to write?" There may be enough information between this thread and the other to solve the puzzle. However, you sadistically torment those whom you ask for help.

    You have a solution. Will you pay for someone to rewrite it in python? I'll learn perl for pay.

    Do we ignore lines that say "patent"? You haven't said.

    What separates the columns? Do columns always end with "words" ending in digits?
  22. #12
  23. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2012
    Posts
    63
    Rep Power
    3
    Hi, thanks for the message.

    I never meant to torment anybody.

    I answered and tried to give the expected output as far as I understood the questions.

    1) Do we ignore lines that say "patent"? You haven't said.

    Yes, patent information is not needed.

    2) What separates the columns?

    Columns are separated by tabs.

    3) Do columns always end with "words" ending in digits?

    No, only the last few rows in the input file end with digits, because they have no disease name or approval/phase after the drug ID.

    Therefore, the expected sample output is like this:

    Uniprot id Drug Drug Id Action Disease Approved/Phase

    P11229 Bethanechol DAP000263 Agonist Urinary retention Approved


    or even like this:

    P11229 Ipratropium bromide DNC000806 Antagonist


    In the second output, the disease and approval columns are blank because they are not given in the input either.
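
With the answers above, the task can be sketched in Python roughly as follows. This is a hedged illustration, not a confirmed solution: it assumes tab-separated input, assumes the action labels are the ones seen in the posted sample (`Agonist`, `Antagonist`, `Binder`, `Inhibitor`, `Multitarget`), and emits one output row per action line, so a drug that has a `Drug(s)` row but no action line would not appear. The name `build_rows` is hypothetical.

```python
# Action labels seen in the sample input (an assumption, not a full list).
ACTIONS = {"Agonist", "Antagonist", "Binder", "Inhibitor", "Multitarget"}


def build_rows(lines):
    """Return rows of [uniprot, drug, drug_id, action, disease, status]."""
    uniprot = {}   # target ID -> UniProt ID
    details = {}   # (target ID, drug ID) -> (disease, status)
    actions = []   # (target ID, drug name, drug ID, action), in input order
    for line in lines:
        fields = [field.strip() for field in line.rstrip("\n").split("\t")]
        if len(fields) < 3:
            continue
        target, key, values = fields[0], fields[1], fields[2:]
        if key == "UniProt ID":
            uniprot[target] = values[0]
        elif key == "Drug(s)" and len(values) >= 4:
            _name, drug_id, disease, status = values[:4]
            details[(target, drug_id)] = (disease, status)
        elif key in ACTIONS and len(values) >= 2:
            actions.append((target, values[0], values[1], key))
        # Every other key (patents, pathways, ...) is ignored.
    rows = []
    for target, name, drug_id, action in actions:
        # Disease and status stay blank when no Drug(s) row exists.
        disease, status = details.get((target, drug_id), ("", ""))
        rows.append([uniprot.get(target, ""), name, drug_id, action, disease, status])
    return rows


sample = (
    "TTDS00001\tUniProt ID\tP11229\n"
    "TTDS00001\tRelated US Patent\t6,211,204\n"
    "TTDS00001\tDrug(s)\tBethanechol\tDAP000263\tUrinary retention\tApproved\n"
    "TTDS00001\tAgonist\tBethanechol\tDAP000263\n"
    "TTDS00001\tAntagonist\tIpratropium bromide\tDNC000806\n"
)
rows = build_rows(sample.splitlines())
for row in rows:
    print("\t".join(row))
```

Lookups are keyed by target ID and drug ID together, so the same drug listed under TTDS00001 and TTDS00002 is kept apart. For a real file, passing the open file object (for example `build_rows(open(path))`, with `path` a placeholder) should work the same way, since the function only iterates over lines.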
  24. #13
  25. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,837
    Rep Power
    480
    There are no tabs in your original post!
    Code:
    b	a	c
    Fields separated by tabs. A test to see if it's possible.

    [edit]I managed to post tabs.[/edit]
    [edit2]Indeed, tabs were the first separator I searched for.[/edit2]
    Last edited by b49P23TIvg; December 16th, 2012 at 07:51 PM. Reason: test conclusion
  26. #14
  27. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2012
    Posts
    63
    Rep Power
    3
    Ok

    If output columns will be tab separated, that would be good.

    Thanks
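
For the tab-separated output requested above, Python's standard `csv` module can write the rows. This is a small sketch under assumptions: the header names follow the column order given earlier in the thread, the two example rows are copied from the posted expected output, and `write_tsv` is a hypothetical helper name.

```python
import csv
import io

HEADER = ["Uniprot id", "Drug", "Drug Id", "Action", "Disease", "Approved/Phase"]


def write_tsv(rows, out):
    """Write the header and the rows to `out` as tab-separated text."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(rows)


rows = [
    ["P11229", "Bethanechol", "DAP000263", "Agonist", "Urinary retention", "Approved"],
    ["P11229", "Ipratropium bromide", "DNC000806", "Antagonist", "", ""],
]
buf = io.StringIO()
write_tsv(rows, buf)
print(buf.getvalue(), end="")
```

Blank disease and status values come out as empty fields, so every line keeps six columns. Writing to a real file would mean passing a file opened with `newline=""` instead of the `StringIO` buffer.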
