March 16th, 2013, 05:10 AM
File header extraction
I have a multifasta file (input.fasta) with many sequences as follows:
In another file (list.txt), I list the wanted header lines as:
The output file (output.txt) should contain only the sequences whose headers appear in the list file. For the above list, the output file would be:
Help would be much appreciated. Thank you.
March 16th, 2013, 01:24 PM
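with open('list.txt') as inf:
    keys = set(L.strip() for L in inf)  # discard extraneous spacing and newlines

with open('input.fasta') as inf, open('output.txt', 'w') as ouf:
    while True:
        K = inf.readline()
        if not K:  # end of file
            break
        assert K[0] == '>'  # every record must start with a header line
        key = K[1:].strip()  # discard extraneous spacing and newline
        V = inf.readline()  # assumes each sequence occupies exactly one line
        if key in keys:
            ouf.write(K + V)  # copy the wanted header and its sequence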
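If your sequences wrap over several lines, reading one line per record won't be enough. Here is a rough sketch that keeps every line up to the next header instead. The names filter_fasta and main are just ones I made up, and I'm assuming the entries in list.txt match the whole header text after the '>' (a leading '>' in the list is tolerated):

```python
def filter_fasta(fasta_lines, wanted):
    """Yield the lines of every FASTA record whose header is in `wanted`.

    `wanted` holds header text without the leading '>' (assumption).
    """
    keep = False
    for line in fasta_lines:
        if line.startswith('>'):
            # A new record starts here: decide whether to keep it.
            keep = line[1:].strip() in wanted
        if keep:
            yield line


def main(fasta_path='input.fasta', list_path='list.txt', out_path='output.txt'):
    # Read the wanted headers, skipping blank lines and an optional leading '>'.
    with open(list_path) as inf:
        wanted = {line.strip().lstrip('>') for line in inf if line.strip()}
    # Stream the FASTA file so large inputs are never held in memory at once.
    with open(fasta_path) as inf, open(out_path, 'w') as ouf:
        ouf.writelines(filter_fasta(inf, wanted))
```

Call main() from the directory that holds input.fasta and list.txt. Anything before the first header line is silently dropped, which may or may not be what you want.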
Comments on this post
[code] ... [/code] tags are essential for Python code and Makefiles!