#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    22
    Rep Power
    0

    How to merge characters with new characters insertion


    Hi all,

    I have a text file with a huge number of contigs (sequences of characters), such as the following (the contig numbers do not reflect the order):

    >contig number 11
    tttgctcggaggggatc
    >contig number 23
    gaaaacacttccttattatacaggtaaaccgtatttggat
    >contig number 3
    aaagctcggaggggatcccct
    ...
    ..

    I want to concatenate the contigs so that the above order is preserved, and I also want to insert the sequence "nnnnncattccattcattaattaattaatgaatgaatgnnnnn" at each contig boundary (here there are two contig boundaries), so that the final output file looks as follows:



    >concatenated contig
    tttgctcggaggggatcnnnnncattccattcattaattaattaatgaatgaatgnnnnngaaaacacttccttattatacaggtaaaccgtatttggatnnnnncattccattcattaattaattaatgaatgaatgnnnnnaaagctcggaggggatcccct



    Any help in solving this problem would be highly appreciated. Thanks in advance.
  2. #2
  3. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,710
    Rep Power
    480
    Code:
    the_answer = "nnnnncattccattcattaattaattaatgaatgaatgnnnnn".join(ListOfContigs)

    Or did you also need help reading the file?
    [code]Code tags[/code] are essential for python code and Makefiles!
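To make the one-liner above concrete, here is a minimal sketch using the three example contigs from the question. The names `SPACER` and `contigs` are illustrative and not from the thread; it assumes `ListOfContigs` is a plain list of sequence strings with the `>` header lines already removed.

```python
# Illustrative names; the spacer is the sequence given in the question.
SPACER = "nnnnncattccattcattaattaattaatgaatgaatgnnnnn"

# The three example contigs, in file order, without their '>' headers.
contigs = [
    "tttgctcggaggggatc",
    "gaaaacacttccttattatacaggtaaaccgtatttggat",
    "aaagctcggaggggatcccct",
]

# str.join places the spacer between consecutive items only,
# so three contigs give two spacers and none at either end.
the_answer = SPACER.join(contigs)

print(">concatenated contig")
print(the_answer)
```

Because `join` only inserts between items, the list order is preserved and no spacer is added before the first contig or after the last one.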
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    22
    Rep Power
    0
    Please give the complete code, if you don't mind.
  6. #4
  7. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,710
    Rep Power
    480
    Code:
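    SPACER = "nnnnncattccattcattaattaattaatgaatgaatgnnnnn"
    # Join every sequence line with the spacer, skipping '>' headers and blank lines.
    with open('path_of_your_input_file_goes_here') as inf:
        result = SPACER.join(line.strip() for line in inf
                             if line.strip() and not line.startswith('>'))

    # Write the header requested in the question, then the merged sequence.
    with open('path_of_OUTPUT', 'w') as ouf:
        ouf.write('>concatenated contig\n' + result + '\n')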

    Comments on this post

    • utpalmtbi agrees : Thanks
    [code]Code tags[/code] are essential for python code and Makefiles!
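The script above treats every non-header line as its own contig, which matches the sample in the question where each sequence sits on a single line. If a contig is wrapped across several lines, the spacer would end up inside it. The following is a hedged sketch of one way to handle that case; `read_contigs` and `merge_contigs` are hypothetical helper names, and the wrapped sample data is made up for illustration.

```python
import io

SPACER = "nnnnncattccattcattaattaattaatgaatgaatgnnnnn"


def read_contigs(handle):
    """Yield one sequence string per '>' record, joining wrapped lines."""
    parts = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if it had any sequence.
            if parts:
                yield "".join(parts)
            parts = []
        else:
            parts.append(line)
    if parts:
        yield "".join(parts)  # last record in the file


def merge_contigs(handle, spacer=SPACER):
    """Concatenate all contigs in file order with the spacer between them."""
    return spacer.join(read_contigs(handle))


# Made-up sample with the first contig wrapped over two lines.
sample = io.StringIO(
    ">contig number 11\ntttgct\ncggagg\n>contig number 23\ngaaaac\n"
)
merged = merge_contigs(sample)
print(">concatenated contig")
print(merged)
```

For a real file, an object returned by `open(...)` can be passed in place of the `io.StringIO` sample. Records that consist of a header with no sequence lines are skipped in this sketch.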
