Python Programming
 
#1
November 19th, 2012, 11:51 AM
djonesyyz
How to filter csv.reader data?

I'm a little lost on how to modify the content loaded into the csv.reader. Any help would be appreciated.

Thanks..

Code:

with open(infile, 'rb') as f:
    output = csv.reader((line.replace('\0', '') for line in f), delimiter='|', quotechar='"')
    out_csv = csv.writer(open(r'C:\BHV\BHV_output.csv', 'ab'))

    out_csv.writerows(output)

    # for row in output:
    #     for col in row:
    #         row = row.replace("notused", "")
    #     out_csv.writerow(new_row)

    # out_csv.writerows(replace(output, 'notused', ''))
    # output = output[:-3]

I would like suggestions to modify the output from:

inotusedInbound,1350324983,,0054,6629,anonymous,0,1350324983.35,1350325007.758,Success,English,IVRH UP,10,NA,Y,Y

~notusedPrompt-and-Collect,1350325024,,0058,6629,Gather Customer Information,1350325025.419,1350325026.834,DTMF,2,Valid,100,1,NA,Y,N


To look like this: (Remove blank line and remove characters from the first column)

Inbound,1350324983,,0054,6629,anonymous,0,1350324983.35,1350325007.758,Success,English,IVRHUP,10,NA, Y,Y
Prompt-and-Collect,1350325024,,0058,6629,Gather Customer Information,1350325025.419,1350325026.834,DTMF,2,Valid,100,1,NA,Y,N


What seemed simple just left me digging for the correct function calls to modify the data before output to file.

Thanks,
Dave.

#2
November 19th, 2012, 12:35 PM
SuperOscar
Adding CODE tags...

Quote:
Originally Posted by djonesyyz
Code:
with open(infile, 'rb') as f:
        output = csv.reader( (line.replace('\0','') for line in f) , delimiter='|',quotechar = '"')


(I find it a trifle strange that you are READING a file named OUTPUT!)

To ignore blank lines you might check the length of the list read:

Code:
    for row in in_csv:
        if len(row) > some_margin_value:
            # now do something
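
A minimal sketch of that length check, written for Python 3 (the code in this thread is Python 2). csv.reader yields an empty list for a completely empty line, so a length test is enough for that case. The function name non_blank_rows, the min_fields parameter (a stand-in for some_margin_value), and the sample data are all made up for illustration.

```python
import csv
import io

def non_blank_rows(lines, min_fields=1, delimiter='|'):
    """Yield parsed rows that have at least min_fields fields."""
    reader = csv.reader(lines, delimiter=delimiter, quotechar='"')
    for row in reader:
        # csv.reader returns [] for an empty line, so len(row) is 0 there.
        if len(row) >= min_fields:
            yield row

# Hypothetical sample input: two data lines separated by a blank line.
sample = io.StringIO('a|1|x\n\nb|2|y\n')
rows = list(non_blank_rows(sample))
print(rows)
```

A line that holds only spaces parses as a one-field row, so min_fields would need to be raised above 1 to drop those as well.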


I wasn’t sure of your other question. Does the “notused” always appear in the first column only? Is it always prepended with a single char that should be ignored as well? If so, do:

Code:
row[0] = row[0][1:].replace('notused', '')


where “[1:]” slices out the first character and .replace() replaces the given literal string with an empty string.
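
As a rough Python 3 sketch of that one-liner applied to a whole row: the helper name clean_first_field and its marker and skip parameters are hypothetical, and the sample row is abbreviated from the data in the first post. It assumes the unwanted prefix is always a fixed number of characters followed by the marker text.

```python
def clean_first_field(row, marker='notused', skip=1):
    """Return a copy of row with the first field cleaned up."""
    if not row:
        return row  # nothing to clean in an empty row
    cleaned = list(row)
    # Drop `skip` leading characters, then remove the marker text.
    cleaned[0] = cleaned[0][skip:].replace(marker, '')
    return cleaned

row = ['inotusedInbound', '1350324983', '', '0054']
print(clean_first_field(row))
```

If the input turns out to carry more than one stray leading character, only the skip value needs to change.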
#3
November 19th, 2012, 01:31 PM
b49P23TIvg
I also understood your questions poorly. You've created an interesting structure of nested generators. I created an intermediate step because you didn't show any of the input. SuperOscar's alternatives to ignore blank (or short) lines could well be more useful than my regular expression, as could the proposed "remove bad fields" algorithm.
Code:
# untested.
# probably has unicode str bytes confusion.
# I didn't keep track of which lines might end in a new line.

import re
import io
import csv

isBlank = re.compile(u'^[ \t]*$').match
bad = u'notused'

with open(infile, 'rb') as f:
    output = csv.reader( (line.replace('\0','') for line in f) , delimiter='|',quotechar = '"')

with io.StringIO() as buffer:
    with csv.writer(buffer) as middle_man:
        middle_man.writerows(output)

with open('C:\BHV\BHV_output.csv','ab') as out_csv:
    for LINE in middle_man:
        if isBlank(LINE):
            continue
        if LINE.startswith(bad): # or notused~ or whatever
            LINE = LINE[len(bad):]
        out_csv.write(LINE)
#4
November 19th, 2012, 01:48 PM
djonesyyz
Creative piece of code! I tried it out and got the following message:

with csv.writer(buffer) as middle_man:
AttributeError: __exit__

#5
November 19th, 2012, 01:58 PM
b49P23TIvg
I'm afraid I've presented an invalid mix of python2 and python3.
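
The AttributeError itself comes from csv.writer(...): writer objects do not implement the context-manager protocol in Python 2 or Python 3, so only the underlying file or buffer can go in a "with" statement. A minimal Python 3 sketch of the intermediate-buffer idea under that constraint, using made-up sample rows:

```python
import csv
import io

# Hypothetical rows, including an empty one that should be dropped.
rows = [['inotusedInbound', '0054'], [], ['~notusedPrompt-and-Collect', '0058']]

# Only the buffer is a context manager; the writer is a plain object.
with io.StringIO(newline='') as buffer:
    middle_man = csv.writer(buffer)
    middle_man.writerows(rows)
    text = buffer.getvalue()  # read the contents before the buffer closes

# Keep only the lines that contain something other than whitespace.
kept = [line for line in text.splitlines() if line.strip()]
print(kept)
```

The buffer has to be read inside the "with" block, because a closed StringIO can no longer be queried.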

#6
November 19th, 2012, 04:34 PM
djonesyyz
OK, I'm now a little closer. I'm having a problem with csv.writer's writerows() not writing any data back out. The print statements show the data:

wnotusedPrompt-and-Collect
wnotusedPrompt-and-Collect
Prompt-and-Collect
lnotusedInbound
lnotusedInbound
Inbound

So successfully stripped out the leading data chars and notused string.. But the data file written contains no data.. ugg...

Code:
with open(filename, 'rb') as f:
#   reader = csv.reader(f, delimiter='|', quoting=csv.QUOTE_NONE)
    file_input = csv.reader( (line.replace('\0','') for line in f) , delimiter='|',quotechar = '"')
    for row in file_input:
        print row[0]
        print row[0][1:]
        row[0] = row[0][2:].replace('notused', '')
        print row[0]
out_csv = csv.writer(open('C:\Users\dajones\workspace\BHV_T1\Read1\BHV_output.csv', 'ab'))
out_csv.writerows(file_input) 

#7
November 20th, 2012, 06:20 AM
SuperOscar
Quote:
Originally Posted by djonesyyz
So successfully stripped out the leading data chars and notused string.. But the data file written contains no data.. ugg...


Well, of course it doesn’t. You’ve already exhausted the input in the “for” loop above, creating and printing and then discarding (since after printing you don’t actually do anything with the row!) each row in its turn. Then you ask csv.writer to save to a file anything that csv.reader gives, although you’ve already encountered an EOF there.

Add CSV write commands inside the reading loop and write each row immediately after changing it.
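
To illustrate the point, a small Python 3 sketch with made-up in-memory data: a csv.reader is a one-shot iterator, so a second pass over it yields nothing, while writing inside the loop saves each row while the modified copy is still in hand.

```python
import csv
import io

# Hypothetical two-line input with a two-character prefix before 'notused'.
source = io.StringIO('xxnotusedInbound|0054\nyynotusedPrompt-and-Collect|0058\n')
reader = csv.reader(source, delimiter='|', quotechar='"')

out = io.StringIO(newline='')
writer = csv.writer(out)

for row in reader:
    row[0] = row[0][2:].replace('notused', '')
    writer.writerow(row)  # write immediately after changing the row

leftover = list(reader)  # the reader is exhausted, so this is empty
print(leftover)
print(out.getvalue())
```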

#8
November 20th, 2012, 12:45 PM
djonesyyz
That explains it. Lesson learned about exhausting the input. I got a little stuck because I was using .writerows instead of .writerow. Thanks for the help.

Code:
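import csv

#filename = raw_input('Please enter BHV log file:')
#file_output = raw_input('Please enter output file name:')
filename = 'BHV151110-00060-'
file_output = 'BHV_output'

# Raw string so the backslashes in the Windows path are taken literally.
out_path = r'C:\Users\dajones\workspace\BHV_T1\Read1\BHV_output.csv'

with open(filename, 'rb') as f, open(out_path, 'ab') as out_file:
    # Strip NUL bytes from each line before the csv parser sees it.
    file_input = csv.reader((line.replace('\0', '') for line in f), delimiter='|', quotechar='"')
    out_csv = csv.writer(out_file)
    for row in file_input:
        # Drop the two leading characters and the 'notused' marker from the first column.
        row[0] = row[0][2:].replace('notused', '')
        out_csv.writerow(row)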
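
For readers on Python 3, a hedged sketch of the same approach: the csv module there expects text-mode files opened with newline='' rather than 'rb' and 'ab'. The function name filter_log, the latin-1 encoding, the blank-row check, and the temporary file names are assumptions for illustration, not details from the thread.

```python
import csv
import os
import tempfile

def filter_log(in_path, out_path, encoding='latin-1'):
    """Copy a pipe-delimited log to CSV, cleaning the first column."""
    with open(in_path, newline='', encoding=encoding) as src, \
         open(out_path, 'a', newline='', encoding=encoding) as dst:
        # Strip NUL characters from each line before parsing it.
        reader = csv.reader((line.replace('\0', '') for line in src),
                            delimiter='|', quotechar='"')
        writer = csv.writer(dst)
        for row in reader:
            if not row:  # skip blank lines, which parse as []
                continue
            row[0] = row[0][2:].replace('notused', '')
            writer.writerow(row)

# Usage with throwaway files in a temporary directory.
tmp = tempfile.mkdtemp()
src_path = os.path.join(tmp, 'in.log')
dst_path = os.path.join(tmp, 'out.csv')
with open(src_path, 'w', newline='', encoding='latin-1') as f:
    f.write('\0linotusedInbound|0054\n\n\0w~notusedPrompt|0058\n')
filter_log(src_path, dst_path)
with open(dst_path, newline='', encoding='latin-1') as f:
    result = f.read()
print(repr(result))
```

Opening both files in a single "with" statement also closes the output file deterministically, which the csv.writer(open(...)) pattern leaves to the interpreter.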
