#1
  1. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    22
    Rep Power
    0

    File header change


    Hello all,

    I have a multi-FASTA file (header lines starting with >, followed by sequence lines) in the following format:

    >contig_1 # 498 # 1826 # 1 # ID=1_1;partial=00;start_type=ATG;rbs_motif=AGxAGG/AGGxGG;rbs_spacer=5-10bp;gc_cont=0.406
    MNLTFDYTKEPSRDVLCIDVKSFYASVECVERG
    LDPLKTMLVVMSNSENSGGLVLAASPM
    >contig_2 # 1823 # 2173 # 1 # ID=1_2;partial=00;start_type=ATG;rbs_motif=GGA/GAG/AGG;rbs_spacer=5-10bp;gc_cont=0.311
    MKQNRKEFSSYFSRSIKQNKPLYLLLMSSETNPF
    PRPVIGTFRGYVEENKIIIGEDSYSI
    ....
    ...

    and I want to replace each header line with just a simple running count:

    >1
    MNLTFDYTKEPSRDVLCIDVKSFYASVECVERG
    LDPLKTMLVVMSNSENSGGLVLAASPM
    >2
    MKQNRKEFSSYFSRSIKQNKPLYLLLMSSETNPF
    PRPVIGTFRGYVEENKIIIGEDSYSI
    ....
    ...

    Maybe it's too simple a question, but I have only just started learning Python and got stuck.
    Any idea how to do it?
    Thanks in advance.
  2. #2
  3. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,895
    Rep Power
    481
    I assume you want the number immediately following >contig_
    Code:
    data = '''>contig_1 # 498 # 1826 # 1 # ID=1_1;partial=00;start_type=ATG;rbs_motif=AGxAGG/AGGxGG;rbs_spacer=5-10bp;gc_cont=0.406
    MNLTFDYTKEPSRDVLCIDVKSFYASVECVERG
    LDPLKTMLVVMSNSENSGGLVLAASPM
    >contig_2 # 1823 # 2173 # 1 # ID=1_2;partial=00;start_type=ATG;rbs_motif=GGA/GAG/AGG;rbs_spacer=5-10bp;gc_cont=0.311
    MKQNRKEFSSYFSRSIKQNKPLYLLLMSSETNPF
    PRPVIGTFRGYVEENKIIIGEDSYSI
    '''
    
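    def rewrite(LINE):
        # Turn a '>contig_N ...' header into '>N'; return any other line unchanged.
        S = '>contig_'
        if LINE.startswith(S):
            # Keep only the first whitespace-delimited token after the prefix.
            return '>' + LINE[len(S):].split()[0]
        return LINE
    
    # The same logic can be squeezed into a single and/or expression:
    # return (LINE.startswith(S) and ('>' + LINE[len(S):].split()[0])) or LINE
    
    for L in data.split('\n'):
        print(rewrite(L))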
    [code]Code tags[/code] are essential for python code and Makefiles!
  4. #3
  5. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    22
    Rep Power
    0

    Thank you


    Thank you very much, but what I actually want is a simple counter after each >: 1, 2, and so on,
    like:
    >1
    MNLTFDY
    ..
    >2
    MKQNRKE
    ..
    >3
    RTV
    ..

    but your code will do nicely, thank you.
  6. #4
  7. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,895
    Rep Power
    481
    Code:
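    class count:
        # Callable counter: calling the instance increments it, str() shows the current value.
        def __init__(self,start = 0):
            self.n = start
        def __call__(self):
            self.n += 1
        def __str__(self):
            return '{}'.format(self.n)
    
    counter = count(1)
    print(str(counter))   # prints 1
    counter()             # increment the counter
    print(str(counter))   # prints 2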
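
The counter on its own does not yet touch the FASTA data, so here is a minimal sketch of how the two ideas might be combined: every line that starts with > is replaced by > plus a running number, and sequence lines pass through unchanged. The function name renumber_headers and the shortened sample headers are made up for illustration, and a plain integer is used in place of the count class.

```python
import io

def renumber_headers(lines, start=1):
    """Yield lines, replacing each FASTA header with '>N' (N counts up from start)."""
    n = start
    for line in lines:
        if line.startswith('>'):
            # Header line: discard the original description, emit the counter.
            yield '>{}\n'.format(n)
            n += 1
        else:
            # Sequence line: pass through untouched.
            yield line

# Shortened sample data standing in for the real file (hypothetical headers).
sample = io.StringIO(
    ">contig_1 # 498 # 1826 # 1 # ID=1_1\n"
    "MNLTFDYTKEPSRDVLCIDVKSFYASVECVERG\n"
    "LDPLKTMLVVMSNSENSGGLVLAASPM\n"
    ">contig_2 # 1823 # 2173 # 1 # ID=1_2\n"
    "MKQNRKEFSSYFSRSIKQNKPLYLLLMSSETNPF\n"
    "PRPVIGTFRGYVEENKIIIGEDSYSI\n"
)

result = ''.join(renumber_headers(sample))
print(result, end='')
```

For a real file, the same generator should work on an open file object, for example by opening the input with open(...) and passing it to renumber_headers, then writing the result with writelines on an output file; the file names are up to you.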
