Dev Shed, Python Programming forum
Posts: 12
Time spent in forums: 5 h 24 m 39 sec
Reputation Power: 0
Newbie read .txt file question.
I would like to pull parts out of a text file and output them as a CSV file. Here's a sample of the text file below. The file has some structure: each record is delimited by {}. Could someone please explain how I open the text file and then, for example, extract the field after name and the number after age? I've looked at parsing the file or using regex, but with little knowledge I am struggling to know which is the best way forward or how to implement it. Thank you in advance for any help. (Using Python 3.)
Posts: 97
Time spent in forums: 1 Day 15 h 26 m
Reputation Power: 2
Quote:
Originally Posted by skyblues
I would like to pull parts out of a text file and output them as a CSV file. Here's a sample of the text file below. The file has some structure: each record is delimited by {}. Could someone please explain how I open the text file and then, for example, extract the field after name and the number after age? I've looked at parsing the file or using regex, but with little knowledge I am struggling to know which is the best way forward or how to implement it. Thank you in advance for any help. (Using Python 3.)
Let's say your file is called, "textfile.txt". To open it:
Code:
fid=open("textfile.txt")
Now "fid" is a file object associated with that file. That means it's also an iterator, so you can easily loop over the lines.
Code:
for line in fid:
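A minimal sketch of the same loop written with a with block, which closes the file automatically when the block ends. The file name and its two lines are made up here so the snippet runs on its own; they are not the poster's data.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
# (Path and contents are placeholders, not the data from the thread.)
path = os.path.join(tempfile.mkdtemp(), "textfile.txt")
with open(path, "w", encoding="utf-8") as out:
    out.write("first line\nsecond line\n")

lines = []
# A file object is an iterator: each pass of the loop yields one line,
# including its trailing newline.
with open(path, encoding="utf-8") as fid:
    for line in fid:
        lines.append(line)

print(lines)
```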
The structure of the file you have posted is very much like a Python dictionary. However, treating it as one directly would need something dangerous such as eval(), so let's not.
As you read each line, strip off the line feed and the "{" and the "}":
Code:
line=line.strip("\n{}")
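To see what that strip call does, here is a small sketch. The sample line is an assumption about the layout of the file (quoted keys and values, one record per line); only the name and age values are taken from output shown later in the thread.

```python
# str.strip(chars) removes any of the listed characters from BOTH ends
# and stops at the first character that is not in the set.
line = '{"name":"Bobby Roberts","age":"30.08"}\n'   # assumed record layout
stripped = line.strip("\n{}")
print(stripped)
```

Note that strip only trims the ends, so any braces in the middle of a line would be left alone.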
Now you could do some fancy splitting and you could use regular expressions, but just to get 2 values out I don't think that's necessary. Take the "name" field. Find where "name": is:
Code:
indx=line.find("\"name\":")
Now the actual name starts at indx plus the length of "name": (7 characters), that is at indx+7, and runs until the next comma:
Code:
indx2=line.find(",",indx)
so
Code:
name=line[indx+7:indx2]
and similarly for age, but you'll have to convert the string to a number: int(string), or float(string) if the value has a decimal point.
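The steps above can be gathered into one helper. This is only a sketch: field_after is a hypothetical name, and the sample line assumes quoted values with no commas inside them. A comma inside a name would cut the value short.

```python
def field_after(line, key):
    """Return the text between '"key":' and the next comma (or end of line)."""
    marker = '"' + key + '":'
    start = line.find(marker)
    if start == -1:
        return None                  # key not present on this line
    start += len(marker)             # skip past the marker itself
    end = line.find(",", start)
    if end == -1:
        end = len(line)              # last field: no trailing comma
    # Drop surrounding spaces and quotes, if the value is quoted.
    return line[start:end].strip().strip('"')


# Assumed record layout; not the poster's actual file.
line = '{"name":"Bobby Roberts","age":"30.08"}\n'.strip("\n{}")
name = field_after(line, "name")
age = float(field_after(line, "age"))
print(name, age)
```

Using len(marker) instead of a hard-coded 7 means the same helper works for any key.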
Posts: 12
Time spent in forums: 5 h 24 m 39 sec
Reputation Power: 0
I now have the following code, which works as intended. (Thank you.) How do I go to the next lines in the text file? I guess I need to set up some kind of loop? Thank you again for all the help.
Code:
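fid=open("test.txt")
for line in fid:
    # everything indented under the for runs once per line of the file
    line=line.strip("\n{}")
    indx=line.find("\"name\":")
    indx2=line.find(",",indx)
    # +8 skips "name": plus the opening quote; -1 drops the closing quote
    name=line[indx+8:indx2-1]
    indx1=line.find("\"age\":")
    indx3=line.find(",",indx1)
    age=line[indx1+7:indx3-1]
    print(name, age)
fid.close()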
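On the follow-up question: a for loop over the file already moves to the next line on each pass, as long as the statements are indented under the for. If the records do not sit one per line, an alternative is to read the whole text and pick records out with regular expressions, then write the CSV. The sketch below assumes quoted values, no nested braces, and made-up sample text; the pattern names are hypothetical.

```python
import csv
import io
import re

# Made-up sample in the assumed layout; a record may span several lines.
text = '''{"name":"Mahmoud El-Shazly",
 "age":"32.00"}
{"name":"Bobby Roberts","age":"30.08"}
'''

# One record = everything between "{" and the next "}" (no nesting assumed).
record_re = re.compile(r"\{(.*?)\}", re.DOTALL)
name_re = re.compile(r'"name"\s*:\s*"([^"]*)"')
age_re = re.compile(r'"age"\s*:\s*"?([0-9.]+)"?')

rows = []
for match in record_re.finditer(text):
    body = match.group(1)
    name = name_re.search(body)
    age = age_re.search(body)
    if name and age:
        rows.append((name.group(1), float(age.group(1))))

# An in-memory buffer stands in for open("out.csv", "w", newline="").
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["name", "age"])
writer.writerows(rows)
print(buffer.getvalue())
```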
I merely added [ and ] around the data in the file
Next, here's the code I used to parse it:
Code:
#!/usr/bin/python
import json

# Parse the whole file as JSON; the with block closes it afterwards.
with open('input.json', 'r') as fp:
    json_obj = json.load(fp)

#import pprint
for record in json_obj:
    print(record[u'name'] + " " + record[u'age'])
    #pprint.pprint(record)
I've commented out the code that calls pprint, but you can uncomment it to see what the structure of each record is like.
What json.load() does is read a file and convert it into a Python object (in this case, a list of dictionaries, each of which holds the fields of one record).
Then we simply use a for loop to go through the list and print out the values of specific dictionary keys. Note that the keys are Unicode strings, which is why they are written as u'name' and u'age' instead of 'name' and 'age'. This is because JSON is supposed to work in Unicode per the spec. In Python 3 every str is already Unicode, so the u prefix is optional there. (To convert to ASCII keys, see http://stackoverflow.com/questions/...work-with-ascii for details.)
The nice thing about this approach is that it is very clean and parses all 3 records correctly. Hope this helps.
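Since the original goal was a CSV file, here is a hedged sketch of taking the parsed records one step further with the csv module. The sample text is invented and assumes the file is valid JSON (outer brackets and commas between records); extra keys in a record are ignored.

```python
import csv
import io
import json

# Made-up sample; the real file's other fields are unknown.
text = ('[{"name": "Mahmoud El-Shazly", "age": "32.00", "other": 1}, '
        '{"name": "Bobby Roberts", "age": "30.08", "other": 2}]')

records = json.loads(text)      # a list of dicts

# An in-memory buffer stands in for open("out.csv", "w", newline="").
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "age"],
                        extrasaction="ignore")   # skip keys not listed
writer.writeheader()
for record in records:
    writer.writerow(record)

print(buffer.getvalue())
```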
Last edited by Scorpions4ever : December 18th, 2012 at 01:59 PM.
Posts: 12
Time spent in forums: 5 h 24 m 39 sec
Reputation Power: 0
Quote:
Originally Posted by skyblues
Thank you Scorpions for a different solution. I will look into this tomorrow.
Hi Scorpions,
I tried your code but I receive the following errors. I'm not sure what I've done wrong, but I guess it must be something to do with the data?
Thank you for the help.
Code:
Traceback (most recent call last):
  File "C:/Python33/tester1.py", line 5, in <module>
    json_obj = json.load(fp)
  File "C:\Python33\lib\json\__init__.py", line 264, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "C:\Python33\lib\json\__init__.py", line 309, in loads
    return _default_decoder.decode(s)
  File "C:\Python33\lib\json\decoder.py", line 352, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python33\lib\json\decoder.py", line 370, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
Posts: 12
Time spent in forums: 5 h 24 m 39 sec
Reputation Power: 0
SuperOscar thank you for the reply. I double checked and I have included [ at the beginning and ] end of the text file. I must be doing something else wrong. As I explained in the opening post I'm new to Python so I could be easily making a silly error somewhere else. I will try your solution. Thank you all again for your patience and expert advice whilst dealing with a beginner.
Posts: 12
Time spent in forums: 5 h 24 m 39 sec
Reputation Power: 0
SuperOscar I receive a similar error message when I try your code.
Does the text file need to be saved in a certain format or am I barking up the wrong tree?
Here's the error code:
Traceback (most recent call last):
  File "C:\Python33\tester1.py", line 5, in <module>
    json_obj = json.loads(buff)
  File "C:\Python33\lib\json\__init__.py", line 309, in loads
    return _default_decoder.decode(s)
  File "C:\Python33\lib\json\decoder.py", line 352, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python33\lib\json\decoder.py", line 370, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
I hope I'm not testing everyone's patience with these newbie errors. Thank you all.
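One possible cause, offered as a guess since the file itself is not shown: some Windows editors save UTF-8 files with a byte order mark (BOM) in front of the first character, and the JSON decoder then fails on the very first character. The sketch below builds such a file (path and contents are made up) and reads it with the utf-8-sig codec, which skips a leading BOM if one is present.

```python
import json
import os
import tempfile

# Build a throwaway file that starts with a UTF-8 BOM.
path = os.path.join(tempfile.mkdtemp(), "input.json")
with open(path, "wb") as out:
    out.write(b"\xef\xbb\xbf" + b'[{"name": "Bobby Roberts", "age": "30.08"}]')

# "utf-8-sig" decodes UTF-8 and silently drops a leading BOM.
with open(path, encoding="utf-8-sig") as fp:
    records = json.load(fp)

print(records[0]["name"])
```

If the file has no BOM, utf-8-sig behaves like plain utf-8, so it is a safe default for files of uncertain origin.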
Posts: 412
Time spent in forums: 1 Week 7 h 25 m 55 sec
Reputation Power: 65
Quote:
Originally Posted by skyblues
SuperOscar I receive a similar error message when I try your code.
Well, it’s hard to say from afar. I copied and pasted the data you provided in the first post of this thread into a file and used the code I provided, and the output was:
Code:
Mahmoud El-Shazly 32.00
Bobby Roberts 30.08
Józef Barna 20.10
So I guess the data in your file differs somehow from what you pasted here.
Posts: 12
Time spent in forums: 5 h 24 m 39 sec
Reputation Power: 0
Thank you for the reply, SuperOscar. I've taken the data from my original post and still receive the same errors. If I save the data as an ANSI file I do receive fewer errors.
Traceback (most recent call last):
  File "C:\Python33\tester1.py", line 8, in <module>
    print(record[u'name'] + " " + record[u'age'])
TypeError: list indices must be integers, not str
Would it be to do with the version of Python I'm using? (3.3.0)
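The Python version is probably not the cause. The message says that record is itself a list rather than a dictionary, which would happen if the data ended up wrapped in two levels of brackets, for example if the file already had [ ] and another pair was added. That is a guess; printing type(record) inside the loop would confirm it. A sketch that accepts either shape, with iter_records as a hypothetical helper name and invented sample data:

```python
import json


def iter_records(obj):
    """Yield every dict found in obj, descending through nested lists."""
    if isinstance(obj, dict):
        yield obj
    elif isinstance(obj, list):
        for item in obj:
            yield from iter_records(item)


# Made-up data with one pair of brackets too many.
nested = json.loads('[[{"name": "Bobby Roberts", "age": "30.08"}]]')
names = [record["name"] for record in iter_records(nested)]
print(names)
```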