### Thread: Need Help With Genome Path Program

1.

#### Need Help With Genome Path Program

Hello,

Just for kicks, since I'm following a Bioinformatics course on Coursera.org, I decided to write a Genome Path program. It takes k-mers (here, 3-nucleotide codons) from a made-up genome, compares the suffix of the first k-mer passed to the function with the prefix of the second k-mer, and checks whether they match.

The idea is that if the suffix of each k-mer matches the prefix of the next one, we can possibly work out the genome of a particular organism, since each k-mer in the path is shifted by 1 nucleotide relative to the previous one.

Here's my program. Any idea how to get the (n+1)-th k-mer in the for loop at the end?

Code:
```
# Create a path genome program that will analyze the suffix of the first
# k-mer and look to see if it matches the prefix of the second k-mer

# Make a function that will take in each k-mer and analyze their
# prefixes and suffixes:

def GenomePath(kmer1, kmer2):
    # Find the suffix of k_mer1:
    # Since they are three nucleotide codons, index into kmer_1 and
    # find the '2nd' character aka in reality, its 3rd but you know how
    # computers count!
    suffixKmer1 = kmer1[2]
    prefixKmer2 = kmer2[0]

    if suffixKmer1 == prefixKmer2:
        print("We have a match!")
        print("Proceed to the next kmer!")

# Call the function you just made:
kmer1 = "GTCC"
kmer2 = "CTAG"

GenomePath(kmer1, kmer2)

MadeupGenome = ["GTCC", "CTAG", "GATC", "CATG"]

# Make a list of k-mers and loop with a for loop to call the function
# (Make the k-mers match arbitrarily to make sure the function works)
GenomePath(kmer,kmer+1)
```
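For the loop itself, one possible approach is to pair each k-mer with its successor using `zip`. This is only a sketch: `overlaps` and `adjacent_pairs` are hypothetical helper names that are not part of the program above, and the check compares the last character (`kmer1[-1]`) rather than the fixed index 2, so it also works for the 4-letter strings in the list.

```python
def overlaps(kmer1, kmer2):
    """Hypothetical helper: True if the last character of kmer1
    equals the first character of kmer2."""
    return kmer1[-1] == kmer2[0]


def adjacent_pairs(kmers):
    """Hypothetical helper: list of (current, next) tuples for
    every neighbouring pair of k-mers."""
    # zip stops at the shorter input, so the final k-mer only ever
    # appears as the "next" element, never as "current".
    return list(zip(kmers, kmers[1:]))


made_up_genome = ["GTCC", "CTAG", "GATC", "CATG"]

for current_kmer, next_kmer in adjacent_pairs(made_up_genome):
    if overlaps(current_kmer, next_kmer):
        print(current_kmer, "->", next_kmer, ": match")
    else:
        print(current_kmer, "->", next_kmer, ": no match")
```

An index-based loop over `range(len(made_up_genome) - 1)` using `made_up_genome[i]` and `made_up_genome[i + 1]` would visit the same pairs; `zip` just avoids the manual index arithmetic.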
2. With index origin 0, "first" and index 0 unambiguously mean the same thing.

Otherwise, perhaps you could show some input and the expected output: in other words, test cases.
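Test cases of that kind could look like the sketch below. It assumes a hypothetical variant of the function, here called `suffix_matches_prefix`, that returns a boolean instead of printing; the inputs and expected results are made up for illustration.

```python
def suffix_matches_prefix(kmer1, kmer2):
    """Hypothetical helper: compare the last character of kmer1
    with the first character of kmer2."""
    return kmer1[-1] == kmer2[0]


# Each entry is (first k-mer, second k-mer, expected result).
test_cases = [
    ("GTC", "CTA", True),   # 'C' == 'C'
    ("CTA", "AGG", True),   # 'A' == 'A'
    ("GTC", "GAT", False),  # 'C' != 'G'
]

for kmer1, kmer2, expected in test_cases:
    actual = suffix_matches_prefix(kmer1, kmer2)
    # Fail loudly with the offending inputs if a case does not hold.
    assert actual == expected, (kmer1, kmer2, actual)
```

Writing the cases as data keeps the expected output next to the input, which is what makes a failing case easy to report.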
3. Yeah, after looking at the code, I realized I should have used an index i instead of kmer, and also that I gave 4-letter sequences when in reality codons are triplets of nucleotides.

Anyway, here's my fixed code so far, with the relevant case at the bottom in a comment. All the k-mers should test positive with this function; writing the for loop that iterates over each one is the only issue for me so far:
Code:
```
# Create a path genome program that will analyze the suffix of the first
# k-mer and look to see if it matches the prefix of the second k-mer

# Make a function that will take in each k-mer and analyze their
# prefixes and suffixes:

def GenomePath(kmer1, kmer2):
    # Find the suffix of k_mer1:
    # Since they are three nucleotide codons, index into kmer_1 and
    # find the '2nd' character aka in reality, its 3rd but you know how
    # computers count!

    # Test to see if proper kmers were given to the function:
    print("kmer1 =", kmer1)
    print("kmer2 =", kmer2)
    suffixKmer1 = kmer1[2]
    prefixKmer2 = kmer2[0]

    if suffixKmer1 == prefixKmer2:
        print("We have a match!")
        print("Proceed to the next kmer!")

# Call the function you just made:
kmer1 = "GTC"
kmer2 = "CTA"

GenomePath(kmer1, kmer2)

MadeupGenome = ["GTC", "CTG", "GAT", "TGC"]

# Make a list of k-mers and loop with a for loop to call the function
# (Make the k-mers match arbitrarily to make sure the function works)