January 21st, 2014, 03:59 PM
DNA Sequence Generator
I'm just starting to learn Python in a bioinformatics research lab. My first project was to write a program that generates random DNA sequences, taking the sequence length and the number of copies as parameters. The sequences then need to be output in FASTA format.
For those unfamiliar, a DNA sequence is made up of four "letters": A, G, C, T. Example DNA sequence: ACGTTCCGTACGTACTCT
I am really new to all of this, and I would like some advice on how to go about it and on how to learn Python in general (rely on tutorials, do random projects, etc.).
I am currently using someone else's program for my DNA sequence project, and I plan to go through it line by line to see what is being done.
The first error I encountered when copying over the code was this:
>>> import random
>>> import sys
>>> def simulate_sequence (length) :
        dna = ['A','G','C','T']
        sequence = ''
        for i in range (length) :
            sequence += random.choice (dna)
>>> setsize = int (sys.argv)
Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    setsize = int (sys.argv)
IndexError: list index out of range
January 21st, 2014, 09:09 PM
sys.argv is a list of the command line arguments. Since you didn't supply any arguments when you started Python, the list has length 1, and sys.argv[0] was the only item on the list. Asking for any item past that one raises the IndexError you saw.
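One way to avoid that IndexError is to check the length of the list before indexing into it. This is only a minimal sketch: the function name read_length is made up for illustration, and it assumes the value you want is the first argument after the script name.

```python
def read_length(argv):
    """Return the first command-line argument as an int, or None if it is missing.

    read_length is a hypothetical helper; pass it sys.argv (or any list shaped like it).
    """
    # argv[0] is the script name, so a usable argument needs len(argv) >= 2.
    if len(argv) < 2:
        return None
    return int(argv[1])


# No arguments supplied: only the script name is on the list.
print(read_length(['p.py']))
# One or more arguments supplied: the first one is converted to an int.
print(read_length(['p.py', '13950000', 'base', 'pairs']))
```

In a real script you would call read_length(sys.argv) after import sys and print a usage message when it returns None.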
Let's say I create a file named p.py containing a python program, then I start python from my command shell. Python evaluates the program statements and quits.
Here's the program, followed by a run from the shell:
# display sys.argv
import sys
print(sys.argv)
$ python3 p.py 13950000 base pairs
['p.py', '13950000', 'base', 'pairs']
Therefore, copy the program to a file and run it properly, which I'll guess goes like this:
$ python random_sequence.py LENGTH COPIES
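For reference, here is a rough sketch of what such a random_sequence.py could look like. It is not the program you copied. It assumes the argument order LENGTH COPIES guessed above, and the header names (sequence_1, sequence_2, ...) and the 60-character line width are my own choices. FASTA itself only needs a header line starting with ">" followed by the sequence lines.

```python
import random
import sys


def simulate_sequence(length):
    """Return a random DNA string with the given number of bases."""
    dna = ['A', 'G', 'C', 'T']
    return ''.join(random.choice(dna) for _ in range(length))


def fasta_text(length, copies, width=60):
    """Build FASTA text for `copies` random sequences of `length` bases.

    Header names and the line width are assumptions made for this sketch.
    """
    lines = []
    for n in range(1, copies + 1):
        lines.append('>sequence_%d' % n)
        seq = simulate_sequence(length)
        # Wrap the sequence so that no line is longer than `width` characters.
        for start in range(0, length, width):
            lines.append(seq[start:start + width])
    return '\n'.join(lines)


def main(argv):
    """Read LENGTH and COPIES from argv and print the FASTA text."""
    try:
        length, copies = int(argv[1]), int(argv[2])
    except (IndexError, ValueError):
        # Missing or non-numeric arguments: show how to run the script.
        print('usage: python random_sequence.py LENGTH COPIES')
        return
    print(fasta_text(length, copies))


if __name__ == '__main__':
    main(sys.argv)
```

Note that simulate_sequence returns the string it builds. The version pasted into the shell above builds the sequence but never returns it, so calling it would give back None.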