How to filter csv.reader data?
November 19th, 2012, 11:51 AM
Registered User
Join Date: Nov 2012
Posts: 4
Time spent in forums: 54 m 16 sec
Reputation Power: 0
How to filter csv.reader data?
I'm a little lost on how to modify the content loaded by csv.reader. Any help would be appreciated.
Thanks.
Code:
with open(infile, 'rb') as f:
    output = csv.reader( (line.replace('\0','') for line in f) , delimiter='|',quotechar = '"')
    out_csv = csv.writer(open('C:\BHV\BHV_output.csv', 'ab'))
    out_csv.writerows(output)
    # for row in output:
    #     for col in row:
    #         row = row.replace("notused","")
    #     out_csv.writerow(new_row)
    # out_csv.writerows(replace(output,'notused','')
    # output = output[:-3]
I would like suggestions to modify the output from:
inotusedInbound,1350324983,,0054,6629,anonymous,0,1350324983.35,1350325007.758,Success,English,IVRH UP,10,NA,Y,Y
~notusedPrompt-and-Collect,1350325024,,0058,6629,Gather Customer Information,1350325025.419,1350325026.834,DTMF,2,Valid,100,1,NA,Y,N
To look like this: (Remove blank line and remove characters from the first column)
Inbound,1350324983,,0054,6629,anonymous,0,1350324983.35,1350325007.758,Success,English,IVRHUP,10,NA, Y,Y
Prompt-and-Collect,1350325024,,0058,6629,Gather Customer Information,1350325025.419,1350325026.834,DTMF,2,Valid,100,1,NA,Y,N
What seemed simple has left me digging for the correct function calls to modify the data before writing it to the file.
Thanks,
Dave.

November 19th, 2012, 12:35 PM
Contributing User
Join Date: Jul 2007
Location: Joensuu, Finland
Adding CODE tags...
Quote: Originally Posted by djonesyyz
Code:
with open(infile, 'rb') as f:
    output = csv.reader( (line.replace('\0','') for line in f) , delimiter='|',quotechar = '"')
(I find it a trifle strange that you are READING a file named OUTPUT!)
To ignore blank lines you might check the length of the list read:
Code:
for row in in_csv:
    if len(row) > some_margin_value:
        # now do something
I wasn’t sure of your other question. Does the “notused” always appear in the first column only? Is it always prepended with a single char that should be ignored as well? If so, do:
Code:
row[0] = row[0][1:].replace('notused', '')
where “[1:]” slices out the first character and .replace() replaces the given literal string with an empty string.
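The two suggestions above can be combined into one small filter. This is a minimal sketch written for Python 3 (the thread itself uses Python 2). The function name `clean_rows`, the `min_fields` threshold, and the sample lines are hypothetical, and the sketch assumes every first field starts with exactly one junk character followed by `notused`.

```python
import csv

def clean_rows(rows, min_fields=2, marker='notused'):
    """Yield rows with the junk prefix removed, skipping blank or short rows."""
    for row in rows:
        # csv.reader returns [] for an empty line, so a length check drops it.
        if len(row) < min_fields:
            continue
        # Drop the single leading character, then the marker string.
        row[0] = row[0][1:].replace(marker, '')
        yield row

# Hypothetical pipe-delimited input, including one blank line.
sample = [
    'inotusedInbound|1350324983||0054',
    '',
    '~notusedPrompt-and-Collect|1350325024||0058',
]
cleaned = list(clean_rows(csv.reader(sample, delimiter='|', quotechar='"')))
```

Because `clean_rows` is a generator, its result could also be passed straight to a writer's `writerows()` without building a list first.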

November 19th, 2012, 01:31 PM
Contributing User
I also understood your questions poorly. You've created an interesting structure of nested generators. I created an intermediate step because you didn't show any of the input. SuperOscar's alternatives to ignore blank (or short) lines could well be more useful than my regular expression, as could the proposed "remove bad fields" algorithm.
Code:
# untested.
# probably has unicode str bytes confusion.
# I didn't keep track of which lines might end in a new line.
import re
import io
import csv
isBlank = re.compile(u'^[ \t]*$').match
bad = u'notused'
with open(infile, 'rb') as f:
    output = csv.reader( (line.replace('\0','') for line in f) , delimiter='|',quotechar = '"')
    with io.StringIO() as buffer:
        with csv.writer(buffer) as middle_man:
            middle_man.writerows(output)
        with open('C:\BHV\BHV_output.csv','ab') as out_csv:
            for LINE in middle_man:
                if isBlank(LINE):
                    continue
                if LINE.startswith(bad): # or notused~ or whatever
                    LINE = LINE[len(bad):]
                out_csv.write(LINE)

November 19th, 2012, 01:48 PM
Registered User
Quote: Originally Posted by b49P23TIvg
I also understood your questions poorly. You've created an interesting structure of nested generators. I created an intermediate step because you didn't show any of the input. SuperOscar's alternatives to ignore blank (or short) lines could well be more useful than my regular expression, as could the proposed "remove bad fields" algorithm.
Creative piece of code! I tried it out and got the following message:
with csv.writer(buffer) as middle_man:
AttributeError: __exit__

November 19th, 2012, 01:58 PM
Contributing User
I'm afraid I've presented an invalid mix of Python 2 and Python 3.
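The `AttributeError: __exit__` comes from using the `csv.writer` object itself in a `with` statement: writer objects are not context managers, so `with` should wrap the file or buffer instead. A minimal Python 3 sketch, assuming an in-memory `io.StringIO` buffer stands in for the intermediate step; the sample rows are made up.

```python
import csv
import io

rows = [['Inbound', '0054'], ['Prompt-and-Collect', '0058']]  # made-up data

# The buffer is the context manager; the writer only wraps it.
with io.StringIO(newline='') as buffer:
    middle_man = csv.writer(buffer)
    middle_man.writerows(rows)
    text = buffer.getvalue()  # read the contents before the buffer closes

# csv.writer objects define no __enter__/__exit__, hence the error.
writer_is_context_manager = hasattr(csv.writer(io.StringIO()), '__exit__')
```

Note that a writer cannot be iterated either, so looping over `middle_man` would fail as well; the text has to be read back from the buffer.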

November 19th, 2012, 04:34 PM
Registered User
OK, I'm now a little closer, but csv.writer's writerows() is not writing any data back out. The print statements show the data:
wnotusedPrompt-and-Collect
wnotusedPrompt-and-Collect
Prompt-and-Collect
lnotusedInbound
lnotusedInbound
Inbound
So the leading characters and the notused string were stripped out successfully, but the data file written contains no data. Ugh...
Code:
with open(filename, 'rb') as f:
    # reader = csv.reader(f, delimiter='|', quoting=csv.QUOTE_NONE)
    file_input = csv.reader( (line.replace('\0','') for line in f) , delimiter='|',quotechar = '"')
    for row in file_input:
        print row[0]
        print row[0][1:]
        row[0] = row[0][2:].replace('notused', '')
        print row[0]
    out_csv = csv.writer(open('C:\Users\dajones\workspace\BHV_T1\Read1\BHV_output.csv', 'ab'))
    out_csv.writerows(file_input)

November 20th, 2012, 06:20 AM
Contributing User
Quote: Originally Posted by djonesyyz
So successfully stripped out the leading data chars and notused string.. But the data file written contains no data.. ugg...
Well, of course it doesn’t. You’ve already exhausted the input in the “for” loop above, creating and printing and then discarding (since after printing you don’t actually do anything with the row!) each row in its turn. Then you ask csv.writer to save to a file anything that csv.reader gives, although you’ve already encountered an EOF there.
Add CSV write commands inside the reading loop and write each row immediately after changing it.
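To illustrate the point, here is a small Python 3 sketch: a `csv.reader` is a one-pass iterator, so a second pass over it yields nothing, and changes have to be written while looping. The data and variable names are hypothetical, and `io.StringIO` stands in for the output file.

```python
import csv
import io

lines = ['xnotusedInbound|0054', 'ynotusedPrompt-and-Collect|0058']

# First pass consumes the reader; a second pass finds nothing left.
reader = csv.reader(lines, delimiter='|')
first_pass = [row for row in reader]
second_pass = [row for row in reader]   # empty list: the reader is exhausted

# Writing inside the loop saves each row right after it is modified.
out = io.StringIO(newline='')
writer = csv.writer(out)
for row in csv.reader(lines, delimiter='|'):
    row[0] = row[0][1:].replace('notused', '')
    writer.writerow(row)
written = out.getvalue().splitlines()
```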

November 20th, 2012, 12:45 PM
Registered User
That explains it. Lesson learned about exhausting the input. I got a little stuck because I was using .writerows instead of .writerow. Thanks for the help.
Code:
import re, io, csv
#filename=raw_input('Please enter BHV log file:')
#file_output=raw_input('Please enter output file name:')
filename = 'BHV151110-00060-'
file_output= 'BHV_output'
# Raw string so the backslashes in the Windows path are taken literally.
out_csv = csv.writer(open(r'C:\Users\dajones\workspace\BHV_T1\Read1\BHV_output.csv', 'ab'))
with open(filename, 'rb') as f:
    # Strip NUL bytes from each line before csv.reader parses it.
    file_input = csv.reader( (line.replace('\0','') for line in f) , delimiter='|',quotechar = '"')
    for row in file_input:
        # Drop the two leading characters and the 'notused' marker, then write the row at once.
        row[0] = row[0][2:].replace('notused', '')
        out_csv.writerow(row)
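For readers on Python 3, a hedged sketch of the same script follows. The assumptions: files are opened in text mode with `newline=''` rather than `'rb'`/`'ab'`, the function name `convert` and the file names in a temporary directory are hypothetical stand-ins for the paths above, the sample data is made up, and a check for empty rows is added so that blank lines are skipped.

```python
import csv
import os
import tempfile

def convert(in_path, out_path):
    """Copy a pipe-delimited log to CSV, cleaning the first column."""
    with open(in_path, newline='') as src, \
         open(out_path, 'a', newline='') as dst:
        # Strip NUL characters before the csv module sees each line.
        reader = csv.reader((line.replace('\0', '') for line in src),
                            delimiter='|', quotechar='"')
        writer = csv.writer(dst)
        for row in reader:
            if not row:          # skip blank lines
                continue
            row[0] = row[0][2:].replace('notused', '')
            writer.writerow(row)

# Demo with made-up data in a temporary directory.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, 'sample.log')
out_path = os.path.join(workdir, 'BHV_output.csv')
with open(in_path, 'w', newline='') as f:
    f.write('\x01wnotusedPrompt-and-Collect|1350325024\n'
            '\n'
            '\x01lnotused\0Inbound|1350324983\n')
convert(in_path, out_path)
with open(out_path, newline='') as f:
    result = list(csv.reader(f))
```

Both files are managed by the `with` statement here, so the output file is flushed and closed even if a row raises an error.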