Thread: word counter

    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2005
    Posts
    4
    Rep Power
    0

    word counter


    Hi everybody

    How can I count the number of words in a text file?
    Also, how can I find the length of each sentence in the same text file (assume that each sentence ends with ".", "?", "!", etc.), where the length of a sentence is the number of words in it? Also, print each sentence separately.

    Can you help me, guys?


    ---dreamer300
  2. #2
  3. Banned ;)
    Devshed Supreme Being (6500+ posts)

    Join Date
    Nov 2001
    Location
    Woodland Hills, Los Angeles County, California, USA
    Posts
    9,612
    Rep Power
    4247
    Welcome to the forum. Please read the sticky posts and familiarize yourself with the forum rules. One of these rules states that we don't normally help with homework unless you show some effort in solving the problem yourself.
    Up the Irons
    What Would Jimi Do? Smash amps. Burn guitar. Take the groupies home.
    "Death Before Dishonour, my Friends!!" - Bruce Dickinson, Iron Maiden Aug 20, 2005 @ OzzFest
    Down with Sharon Osbourne

    "I wouldn't hire a butcher to fix my car. I also wouldn't hire a marketing firm to build my website." - Nilpo
  4. #3
  5. Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Dec 2004
    Location
    Meriden, Connecticut
    Posts
    1,797
    Rep Power
    154
    Hehe, good job catching that, Scorpions4Ever. dreamer300, I hope you at least know some Python. Read the file with Python, record each line, use len() to find its length, and use find() to locate specific characters.
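
    A minimal sketch of the two calls mentioned above, written for Python 3 (so print is a function). The sample line is made up for illustration; len() gives the number of characters in a string, and find() gives the index of the first match, or -1 when there is none.

```python
# Made-up sample line, not from anyone's actual file.
line = "Is this a question? Yes."

length = len(line)              # number of characters, not words
question_at = line.find("?")    # index of the first "?"
exclaim_at = line.find("!")     # -1, because there is no "!" in the line

print(length, question_at, exclaim_at)
```

    Note that len() on a string counts characters; to count words, the string has to be split into a list first.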
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Nov 2003
    Posts
    624
    Rep Power
    34
    How can I count the number of words in a text file
    Tools -> Word Count

    Code:
    user@host$ wc -w document.txt
    14
    Or:

    - Open a text file
    - Read the contents
    - Break the contents into words
    - there's room for more discussion here; words aren't always split by spaces, you see.
    - count the words.

    Python is rather good at splitting and searching text.
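
    As a rough sketch of the steps above (Python 3): the definition of a "word" here, a run of letters, digits or apostrophes, is only an assumption, and the names count_words and count_words_in_file are made up for illustration. It shows how that definition can disagree with a plain whitespace split.

```python
import re

def count_words(text):
    # Assumption: a word is a run of letters, digits or apostrophes.
    return len(re.findall(r"[A-Za-z0-9']+", text))

def count_words_in_file(path):
    # Open the file, read the contents, then count the words in them.
    with open(path, "r") as source:
        return count_words(source.read())

sample = "Well-known facts: it's 3 o'clock...already!"
by_pattern = count_words(sample)     # splits on the hyphen and the dots
by_spaces = len(sample.split())      # splits on whitespace only
print(by_pattern, by_spaces)
```

    The two counts differ because "Well-known" and "o'clock...already!" each count as one word when only spaces separate words. Which answer is right depends on what you decide a word is.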

    I suggest you start with an empty text file, and put comments in for each step, then start finding Python code to do each step a bit at a time, such as:

    Code:
    # open a text file
    source = open("filename.txt", 'r')
    
    # read the contents and store them somewhere
    # data = something-or-other
    Also, how can I find the length of each sentence in the same text file (assume that each sentence ends with e.g. ".", "?", "!", etc.)
    I can assume, but Python can't. How do you treat brackets and quotes?

    I prefer the form:

    (This is a sentence).
    "This is a quote".

    Where the brackets/speechmarks are part of the sentence, and the dot denotes the end of the sentence.

    However, it seems that other people prefer:

    (This is a sentence.)
    and
    "This is a sentence."

    Where the sentence being quoted is finished by the dot, and the quotes mean the whole sentence is being quoted, rather than that the quoted phrase is part of the sentence in the text.

    Of course, rarely you might find

    "This is a sentence.".

    Where someone pedantic wishes to indicate that the sentence being quoted is ending, and that the sentence consisting of the quote is ending. But if you just looked for . to mean the end of a sentence, then you would get ". as a sentence on its own...
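
    The stray piece can be reproduced in a few lines (Python 3). The sample text is made up; the split is the naive "cut at every dot" rule described above.

```python
# Made-up sample: a quoted sentence that ends inside and outside the quotes.
text = 'He said "This is a sentence.". Then he left.'

# Naive rule: every "." ends a sentence.
pieces = [piece.strip() for piece in text.split(".") if piece.strip()]

for piece in pieces:
    print(repr(piece))
```

    One of the pieces is a lone quote mark, which is clearly not a sentence.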

    ... an ellipsis would also mess around with that idea.

    As would people who ask questions with exclamations;

    "She did WHAT?!"

    or "But HOW?????"

    Anyway.

    - Define what your "end of sentence" characters are.
    - Group all of the text in the file into one lump
    - Split it up wherever there is an end-of-sentence character.
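
    A sketch of those three steps (Python 3), assuming the end-of-sentence characters are ".", "?" and "!" and that a run of them such as "?!" or "..." counts as a single break. The names END_OF_SENTENCE and split_sentences are made up for illustration.

```python
import re

# Step 1: define the end-of-sentence characters (an assumption).
END_OF_SENTENCE = ".?!"

def split_sentences(text):
    # Step 2: group the text into one lump by collapsing line breaks.
    lump = " ".join(text.split())
    # Step 3: split wherever one or more end-of-sentence characters appear.
    pattern = "[" + re.escape(END_OF_SENTENCE) + "]+"
    return [piece.strip() for piece in re.split(pattern, lump) if piece.strip()]

sample = "She did WHAT?! But HOW?????\nWait... I see."
sentences = split_sentences(sample)
print(sentences)
```

    The terminators themselves are thrown away by the split, and the ellipsis after "Wait" is treated as the end of a sentence, which may or may not be what you want.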

    Also, print each sentence separately.
    Hint:
    Code:
    >>> for item in ["list", "of", "items"]:
    ...     print(item)
    list
    of
    items
    Yes, I'm beating around the bush...
  8. #5
  9. Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Dec 2004
    Location
    Meriden, Connecticut
    Posts
    1,797
    Rep Power
    154
    Very detailed tutorial, sfb. I congratulate you.

    Comments on this post

    • jacktasia agrees
  10. #6
  11. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2005
    Posts
    4
    Rep Power
    0
    Thank you †Yegg†, sfb and Scorpions4ever.

    Ok guys, I know some aspects of Python, e.g. I wrote this:
    -------------------------------
    f = open ('me.txt' , 'r')

    a = f.readline()
    b = " "
    counter = 0

    while (a != " ") :
        for b in (a):
            counter = counter + 1
        break
    print counter
    f.close()
    ---------------------------------

    I want to tell you that I tried.
    I know about for and while loops, the if statement, and how to open a file. However, the problem was how to combine this information to get the right answer.

    --- dreamer300
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2005
    Posts
    4
    Rep Power
    0
    I used this :

    -----------------------------
    f=open ('me.txt','r')

    a = f.read()

    print len( a.split() )
    -----------------------------
    It printed the number of words in my text file.
    But how can I also split the text file into sentences?
    Could you please help me?


    ---dreamer300
  14. #8
  15. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2004
    Posts
    10
    Rep Power
    0
    Try something like this:

    Code:
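    # Read the whole file (open() also works where the file() builtin does not exist)
    article = open ( 'article.txt' ).read()
    # Treat '!' and '?' as '.', so there is only one terminator to split on
    article = article.replace ( '!', '.' ).replace ( '?', '.' )
    # split() yields one more piece than there are '.' characters, hence the - 1
    sentences = len ( article.split ( '.' ) ) - 1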
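
    The same replace-then-split idea can be stretched to print each sentence with its length in words. This is only a sketch in Python 3; the function name sentence_lengths and the sample text are made up, and it uses a string instead of a file so it runs on its own.

```python
def sentence_lengths(article):
    # Same trick as above: turn every terminator into '.' and split on it.
    article = article.replace("!", ".").replace("?", ".")
    results = []
    for sentence in article.split("."):
        words = sentence.split()
        if words:  # skip the empty pieces left by the split
            results.append((" ".join(words), len(words)))
    return results

for sentence, length in sentence_lengths("Hi everybody! How are you? I am fine."):
    print(length, sentence)
```

    To use it on a file, pass in the text returned by read() instead of the sample string.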
  16. #9
  17. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2005
    Posts
    4
    Rep Power
    0
    Thank you Peyton
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2004
    Posts
    84
    Rep Power
    11
    you could probably put some more effort into finding sentences if you were so inclined. the simple rules you have at the moment will find most sentences, but you don't cater for a few things.

    1) abbreviations, such as etc., use a period and don't necessarily end a sentence (and when they do, still only one period is used). also, it is not uncommon for people to write things like 'etc...', which could create a few more sentences than there really are.

    2) a lot of the time, words on either side of characters such as ':', ';', and '-' should be considered as sentences as well.

    3) since you're not actually processing the content of the sentences, it doesn't matter that much, but you're also missing out on sentences that end with quote marks, like 'and he said "hello."', which most commonly end with '."'. you would be cutting off the quote, and this could cause you some problems if you were ever to go on and process the content.

    just a few things I thought I should point out if you wanted to go above and beyond the call of duty. finding the sentences in some text is by no means a clear cut and simple problem, as you'll always find people who don't like to obey the rules, or just plain don't know the rules.
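
    a rough sketch (Python 3) of how points 1 and 3 might be handled. the abbreviation list is a deliberately tiny, made-up one, and the names ABBREVIATIONS and split_sentences are only for illustration; point 2 is not covered.

```python
import re

# Hypothetical, deliberately tiny list; a real one would be much longer.
ABBREVIATIONS = {"etc.", "mr.", "mrs.", "dr."}

def split_sentences(text):
    # A chunk is text up to a run of terminators, plus any closing
    # quotes or brackets that directly follow them (point 3).
    chunks = re.findall(r'[^.!?]+[.!?]+["\')\]]*', text)
    sentences = []
    current = ""
    for chunk in chunks:
        current += chunk
        last_word = current.split()[-1].lower()
        if last_word in ABBREVIATIONS:
            continue  # the period belongs to an abbreviation (point 1)
        sentences.append(current.strip())
        current = ""
    if current.strip():
        # Text left over because it ended on an abbreviation.
        sentences.append(current.strip())
    return sentences

sample = 'We sell apples, pears, etc. in the shop. And he said "hello." Then he left!'
result = split_sentences(sample)
for sentence in result:
    print(sentence)
```

    it still gets things wrong: a sentence that really does end in 'etc.' is glued to the next one, dotted abbreviations like 'e.g.' are not recognised, and trailing text with no terminator is ignored.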
