#1
  1. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2017
    Posts
    5
    Rep Power
    0

    Finding the term frequency in a list (new to programming)


    Hi guys

    I’m making a ‘term frequency’ program that counts how many times each string appears in a list. For example, the list below should produce the counts shown after it:

    ['a', 'horse', 'a', 'fast', 'horse', 'jumps', 'over', 'the', 'smart', 'duck']

    ({'a': 2, 'horse': 2, 'over': 1, 'fast': 1, 'duck': 1, 'the': 1, 'jumps': 1, 'smart': 1})

    Here is the code I need to write a function for. Bag() is an ADT that helps the program run.

    Code:
    from Bag import *
    terms = ['the','fox','the','quick','fox','jumps','over','the','lazy','dog']
    document = Bag()
    for term in terms:
        document.add(term)
    def tf(term, document):
        # TODO: return the term frequency of `term` in `document`
        pass
    print(tf('the', document))
    This is what I have for the body at the moment, but I can only get it to count the strings. How do I divide each string's count by the length of the list to get the term frequency?

    Code:
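    from collections import Counter
    terms = ['the', 'fox', 'the', 'quick', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
    counts = Counter(terms)  # maps each term to the number of times it occurs
    print(counts)
    # prints a Counter of per-term counts, including 'the': 3 and 'fox': 2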
    Any help is appreciated
  2. #2
  3. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2017
    Posts
    5
    Rep Power
    0
    I forgot to add that the result, once the function is applied, should look like this:

    print(tf('the', document))  # 0.3

    Any help & ideas welcome

    Thanks again!
  4. #3
  5. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2017
    Posts
    5
    Rep Power
    0
    Please can someone delete this post for me, as it's not needed anymore. Thank you.
  6. #4
  7. Contributing User
    Devshed God 1st Plane (5500 - 5999 posts)

    Join Date
    Aug 2011
    Posts
    5,898
    Rep Power
    509
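    The result of this division depends on whether you use Python 2 or Python 3: in Python 2, dividing one integer by another with / discards the fractional part, while in Python 3 it returns a float. In either case,
    the_integral_count / float(len(LIST))
    will work.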
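    As a rough sketch of how that expression might fit into the tf function from the first post: the Bag ADT's interface isn't shown in the thread, so this version assumes a plain list of terms and uses collections.Counter instead. The second parameter name (terms) and the empty-list guard are assumptions for illustration, not part of the original code.

```python
from collections import Counter


def tf(term, terms):
    """Return the fraction of entries in `terms` that equal `term`.

    Assumption: `terms` is a plain list of strings, not the Bag ADT
    from the original post (its interface is not shown in the thread).
    """
    if not terms:
        return 0.0  # avoid dividing by zero on an empty list
    counts = Counter(terms)
    # float() forces true division on Python 2; on Python 3 it is harmless.
    return counts[term] / float(len(terms))


terms = ['the', 'fox', 'the', 'quick', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
print(tf('the', terms))  # 0.3
```

    A term that never appears gives 0.0, because a Counter returns 0 for missing keys rather than raising a KeyError.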
    [code]Code tags[/code] are essential for python code and Makefiles!
