#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    15
    Rep Power
    0

    Python is Reading My .txt File Incorrectly


    I have a file in .txt format containing just a bunch of words in a list-like layout. When I open and read the file in Python, it gives me all the margin, column and formatting information as if they were words. Like this:

    word is {\rtf1\ansi\ansicpg1252\cocoartf1187\cocoasubrtf340
    word is {\fonttbl\f0\fswiss\fcharset0 Helvetica;}
    word is {\colortbl;\red255\green255\blue255;}
    word is \margl1440\margr1440\vieww10800\viewh8400\viewkind0
    word is \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural
    word is

    It also reads out the words that are actually in the list, each with a "\" right after it.

    I've tried .rtf and .doc also. The .rtf does the same thing and it won't even open the .doc file format.

    Any clues?
    Any help will be much appreciated. Thanks.
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    138
    Rep Power
    2
    Originally Posted by CastorTroy
    I have a file in .txt format containing just a bunch of words in a list-like layout. When I open and read the file in Python, it gives me all the margin, column and formatting information as if they were words. Like this:

    word is {\rtf1\ansi\ansicpg1252\cocoartf1187\cocoasubrtf340
    word is {\fonttbl\f0\fswiss\fcharset0 Helvetica;}
    word is {\colortbl;\red255\green255\blue255;}
    word is \margl1440\margr1440\vieww10800\viewh8400\viewkind0
    word is \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural
    word is

    It also reads out the words that are actually in the list, each with a "\" right after it.

    I've tried .rtf and .doc also. The .rtf does the same thing and it won't even open the .doc file format.

    Any clues?
    Any help will be much appreciated. Thanks.
    Please post sample input data, your Python code, and the expected output. From the explanation above alone, it's impossible to help. Also state your operating system and Python version.
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    15
    Rep Power
    0
    Originally Posted by partoj
    Please post sample input data, your Python code, and the expected output. From the explanation above alone, it's impossible to help. Also state your operating system and Python version.
    I'm using OS X Mountain Lion and Python 3.3. The code I wrote has a lot of flaws and needs to be completely redone. The real .txt file, which I pulled off the internet, has over 100,000 words, so debugging with it takes a long time. That is why I created my own .txt file with about twenty words in it.

    Maybe you can point me in the right direction. I'm trying to take a list of words, find the pairs that are exactly the same when one is spelled backwards, and put them into a list of their own. I tried using recursion, but that does not work unless I am doing something wrong.

    Any ideas?

    I'm kind of new to this and teaching myself through books and online tutorials so the simpler the better for now.

    Thanks
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2009
    Posts
    478
    Rep Power
    33
    I've tried .rtf and .doc also. The .rtf does the same thing and it won't even open the .doc file format.
    You have to save it as a plain text file. The crap you are getting when you read the file (color, font, etc.) is formatting info for the type of file it is. A plain text file does not have any of this. Something like the following will create a text output file that can be used for testing, as long as you don't open and (auto)save it from something else. Note that you will still have to strip() the newline from each line when you read the file back in.
    Code:
    words_list = ["cat", "dog", "horse", "goat", "parrot"]
    
    with open("./test_words.txt", "w") as output:
        for word in words_list:
            output.write("%s\n" % (word))
    Last edited by dwblas; March 7th, 2013 at 12:05 PM.
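    For the strip() step mentioned above, here is a minimal sketch of reading such a test file back in. The helper name read_words and the temporary directory are assumptions made for illustration, not part of the original post.

```python
import os
import tempfile

def read_words(path):
    """Return the non-empty lines of a plain text file, without newlines."""
    with open(path, "r") as handle:
        # strip() removes the trailing "\n" and any surrounding spaces
        return [line.strip() for line in handle if line.strip()]

# Write a small test file into a temporary directory, then read it back.
words_list = ["cat", "dog", "horse", "goat", "parrot"]
path = os.path.join(tempfile.mkdtemp(), "test_words.txt")
with open(path, "w") as output:
    for word in words_list:
        output.write("%s\n" % word)

print(read_words(path))
```

    Skipping blank lines is a design choice here, so that a trailing empty line in the file does not show up as an empty "word".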
  8. #5
  9. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2012
    Location
    Tewksbury, MA
    Posts
    36
    Rep Power
    2
    You still need to post your input file or we can't tell you much. Obviously it's not just text, since the output contains formatting data. Whatever Mac text editor you're using to create the file must be adding it. Is there a simple "Notepad"-style text editor for Mac you can try?
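    One way to confirm that a file is not plain text, regardless of its extension, is to look at its first few bytes. RTF files normally begin with the characters {\rtf, which matches the first line of output quoted in this thread. The sketch below assumes that signature check is enough for this purpose; the function name looks_like_rtf and the sample file names are made up for the example.

```python
import os
import tempfile

RTF_SIGNATURE = b"{\\rtf"  # the bytes an RTF file normally starts with

def looks_like_rtf(path):
    """Return True if the file starts with the RTF signature."""
    with open(path, "rb") as handle:
        return handle.read(len(RTF_SIGNATURE)) == RTF_SIGNATURE

# Build two small sample files: one RTF saved as .txt, one real plain text.
folder = tempfile.mkdtemp()
rtf_path = os.path.join(folder, "words.txt")
plain_path = os.path.join(folder, "plain.txt")
with open(rtf_path, "wb") as handle:
    handle.write(b"{\\rtf1\\ansi\\ansicpg1252 cat\\\n}")
with open(plain_path, "wb") as handle:
    handle.write(b"cat\ndog\n")

print(looks_like_rtf(rtf_path))    # True
print(looks_like_rtf(plain_path))  # False
```

    If the check returns True for a file with a .txt extension, the editor saved rich text under a plain text name, and the file needs to be saved again as plain text.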
  10. #6
  11. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,841
    Rep Power
    480
    If you can make a list of the words,
    Code:
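    def palindrome(a):
        # True if the word reads the same forwards and backwards
        return list(a) == list(reversed(a))
    
    # words is assumed to be the list of words read from the file
    set(a for a in words if palindrome(a))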
    [code]Code tags[/code] are essential for python code and Makefiles!
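    The palindrome check above finds single words that read the same in both directions. The earlier question in this thread asks for pairs of different words that are each other's reverse, which is a slightly different problem. A minimal sketch of one way to do that without recursion follows; the function name reverse_pairs and the sample words are assumptions for illustration.

```python
def reverse_pairs(words):
    """Return sorted (word, reversed_word) pairs where both are in words."""
    word_set = set(words)  # set lookups stay fast even for 100,000 words
    pairs = set()
    for word in word_set:
        backwards = word[::-1]
        # word < backwards skips palindromes and keeps one copy of each pair
        if word < backwards and backwards in word_set:
            pairs.add((word, backwards))
    return sorted(pairs)

sample = ["stressed", "desserts", "cat", "level", "parts", "strap", "dog"]
print(reverse_pairs(sample))
# [('desserts', 'stressed'), ('parts', 'strap')]
```

    Here "level" is left out because it is a palindrome rather than a pair, and "dog" is left out because "god" is not in the sample list.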
