#1
  1. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2013
    Posts
    1
    Rep Power
    0

    Python script for FASTA file


    Hi everyone,


    Can someone please help me with the following Python script? I received the message DeprecationWarning: the sets module is deprecated
    from sets import Set

    After googling, I tried the fixes others suggest (changing sets to set, or deleting the from sets import Set line), but none of them works.

    Can someone suggest how to modify the following code so that the input file is read from standard input?
    I'd like to run it with the Unix command

    script.py < sequence.fna


    Thanks a bunch.



    #!/usr/local/bin/python

    import math
    from sets import Set


    line = file("sequence.fna", "r")

    for x in line:
        if x[0] == ">":

            # determine the length of sequences
            s = line.next()
            s = s.rstrip()
            length = len(s)

            # determine the GC content
            G = s.count('G')
            C = s.count('C')
            GC = 100 * (float(G + C) / length)


            stList = list(s)
            alphabet = list(Set(stList))

            freqList = []
            for symbol in alphabet:
                ctr = 0
                for sym in stList:
                    if sym == symbol:
                        ctr += 1
                freqList.append(float(ctr)/len(stList))

            # Shannon entropy
            ent = 0.0
            for freq in freqList:
                ent = ent + freq * math.log(freq, 2)
            ent = -ent

            print x
            print "Length:", length
            print "G+C:", round(GC), "%"
            print 'Shannon entropy:'
            print ent
            print 'Minimum number of bits required to encode each symbol:'
            print int(math.ceil(ent))
  2. #2
  3. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    138
    Rep Power
    2
    Originally Posted by miclow
    Hi everyone,


    Can someone please help me with the following Python script? I received the message DeprecationWarning: the sets module is deprecated
    from sets import Set

    After googling, I tried the fixes others suggest (changing sets to set, or deleting the from sets import Set line), but none of them works.
    Could you please elaborate on what "none of them works" means?

    Can someone suggest how to modify the following code so that the input file is read from standard input?
    I'd like to run it with the Unix command

    script.py < sequence.fna
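    Code:
    import fileinput

    # fileinput.input() reads the files named on the command line,
    # or standard input when none are given (script.py < sequence.fna).
    for x in fileinput.input():
        # x is one line of input; do work here
        pass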
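
A minimal sketch of how the header/sequence loop from the first post might be driven from standard input. It assumes Python 3 and that every ">" header is followed by exactly one sequence line, as the original `line.next()` call implies. The names `read_records`, `gc_percent` and `main` are hypothetical, chosen only for this illustration.

```python
import sys


def read_records(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines.

    Assumption: each ">" header is followed by exactly one sequence line.
    """
    lines = iter(lines)
    for x in lines:
        if x.startswith(">"):
            # The line right after the header is taken as the sequence;
            # fall back to "" if the input ends on a header.
            s = next(lines, "").rstrip()
            yield x.rstrip(), s


def gc_percent(s):
    """Return the G+C content of s as a percentage (0.0 for an empty string)."""
    if not s:
        return 0.0
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def main(stream=sys.stdin):
    # sys.stdin can be iterated line by line, which is what
    # "script.py < sequence.fna" feeds the script.
    for header, s in read_records(stream):
        print(header)
        print("Length:", len(s))
        print("G+C:", round(gc_percent(s)), "%")
```

Calling `main()` at the bottom of script.py should let it run as `script.py < sequence.fna`. Passing `fileinput.input()` instead of `sys.stdin` would probably work as well, since it is also iterated line by line.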
  4. #3
  5. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,855
    Rep Power
    481
    # set is a built-in type in Python 2.4 and later (including 2.7 and 3.x).
    # Delete the "from sets import Set" line and use set in place of Set.


    alphabet = list(set(stList))

    # alphabet now holds the unique characters from stList
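
As a sketch of how the built-in `set` might slot into the frequency and entropy steps of the original script, assuming Python 3 (the function name `shannon_entropy` is hypothetical and not part of the original code):

```python
import math


def shannon_entropy(s):
    """Return the Shannon entropy of string s in bits per symbol."""
    if not s:
        return 0.0
    # The built-in set yields the unique symbols; no import is needed.
    alphabet = set(s)
    # Relative frequency of each symbol in the sequence.
    freqs = [s.count(symbol) / float(len(s)) for symbol in alphabet]
    # H = -sum(p * log2(p)) over the symbols that actually occur.
    return -sum(p * math.log(p, 2) for p in freqs)
```

For a sequence such as "ACGT", where all four symbols are equally frequent, this should come out at about 2 bits per symbol.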
    [code]Code tags[/code] are essential for Python code and Makefiles!
