Perl Programming
 
#1  ThaBomb, January 1st, 2001, 05:07 AM

I am trying to extract data from a wrap-up data file that contains a conversation between a support agent and a customer, and then store that information in a MySQL database. I know how to separate the data when it is on a single line, but how do you get data that spans multiple lines?

For example (see below), the wrap-up note contains multiple lines. How do I assign all of those lines to a single string variable so that I can insert it into the database? Do I look for the line "******** Agent Notes **********" and make it the starting point, look for the line "*** Customer's Email Message ***" and make it the ending point, then take everything between those two lines and assign it to a variable?

Some sample code (reg. exp.) please!

Thanks a lot.

--DVN

SAMPLE DATA FILE:

********** Binding ************
Email Address: ThaBomb@weareit.com
Problem Type: Pricing/Promotion
Customer Type: Consumer

******** Agent Notes **********
Wrap-up Note:
this is email wrap up from conversation
between DVN and customer #12345. Customer
inquired on prices for image capture
software for Nikon microscope 12/20/00.

*** Customer's Email Message ***

#2  vpopper, January 1st, 2001, 09:06 PM

Here's one possible solution:

[code]
# slurp in the whole file
undef $/;

open FILE, "<$file" or die "Cannot open $file: $!\n";
$input = <FILE>;
close FILE;

foreach my $section ( split /\*\*?\s*(.*?)\s*\*\*/, $input) {
    chomp($section);
    $section =~ s{^\s+|\s+$}{}g;
    next unless $section =~ m{Agent\s+Notes};

    ($notes) = $section =~ m{Wrap-up\s+Note.(.*)$};
    $agent_notes{some_identifier} = $notes;
}
[/code]

*Note that I had to use a kludge for the "Wrap-up" regexp because this forum CGI turns part of it into a smiley, even if you request that smilies be disabled :-(
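
For reference, here is a minimal, self-contained sketch of the start-marker/end-marker idea from the original question. It assumes that banner lines begin with at least three asterisks and that the note follows a "Wrap-up Note:" label. The helper name `extract_agent_notes` is hypothetical, and the sketch uses the `/s` modifier so that `.` can match newlines, which a multi-line capture needs.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text after "Wrap-up Note:" inside the
# "Agent Notes" section, or undef if the section or label is missing.
sub extract_agent_notes {
    my ($input) = @_;

    # Grab everything between the "Agent Notes" banner and the next
    # banner line (assumed to start with three asterisks) or end of input.
    #   /m: ^ matches at each line start   /s: . matches newlines
    #   /x: whitespace and comments in the pattern are ignored
    my ($body) = $input =~ m{
        ^\*{3,} [ \t]* Agent \s+ Notes [ \t]* \*+ [ \t]* \n   # start banner
        (.*?)                                                 # section body
        (?= ^\*{3} | \z )                                     # stop here
    }xms;
    return unless defined $body;

    # Keep only what follows the label, with surrounding whitespace trimmed.
    my ($notes) = $body =~ m{Wrap-up\s+Note:\s*(.*?)\s*\z}s;
    return $notes;
}

my $sample = <<'END';
********** Binding ************
Email Address: ThaBomb@weareit.com

******** Agent Notes **********
Wrap-up Note:
this is email wrap up from conversation
between DVN and customer #12345.

*** Customer's Email Message ***
END

print extract_agent_notes($sample), "\n";
```

Matching the two banners directly avoids having to reason about what `split` returns when its pattern contains a capture group, at the cost of hard-coding the section name in the pattern.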


[This message has been edited by vpopper (edited January 01, 2001).]

#3  ThaBomb, January 2nd, 2001, 12:42 AM

Thank you very much for your response. Looking at your code briefly, it looks like it will work. But I don't understand one line:
($notes) = $section =~ m/Wrap-up\s+Note.(.*)$/;

($notes) <=== How does this work?

This is the first time I have seen this usage.

--DN

#4  vpopper, January 2nd, 2001, 02:46 PM

[quote]Originally posted by ThaBomb:
I don't understand one line:
($notes) = $section =~ m/Wrap-up\s+Note.(.*)$/;

($notes) <=== How does this work?
[/quote]

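$notes is assigned the text matched by the parentheses in the regexp, i.e. everything after "Wrap-up Note:". Because ($notes) is a list, the match runs in list context and returns its captured groups instead of a true/false value. You could think of $notes as a copy of $1.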
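
A short sketch of the difference between the two contexts may help. The sample strings are made up for illustration (the "Problem Type" line is borrowed from the sample data file), and `/s` is added so that `.` also matches newlines.

```perl
use strict;
use warnings;

my $section = "Wrap-up Note:\nok, customer called back\n";

# List context: the match returns its captures, so $notes gets group 1.
my ($notes) = $section =~ m/Wrap-up\s+Note:\s*(.*)/s;

# Scalar context: the match only reports success (1) or failure.
my $matched = $section =~ m/Wrap-up\s+Note:\s*(.*)/s;

# Several groups can be unpacked in a single assignment.
my ($key, $value) = "Problem Type: Pricing/Promotion" =~ m/^([^:]+):\s*(.*)$/;

print "notes:   $notes";
print "matched: $matched\n";
print "$key => $value\n";
```

Dropping the parentheses around `$notes` is a common slip: the assignment still succeeds, but the variable ends up holding the success flag rather than the captured text.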

#5  ThaBomb, January 2nd, 2001, 10:05 PM

Hello vpopper,

I finally had a chance to try your code snippet today and I couldn't get it to work. Am I doing something wrong?

I tried split /\*\*?\s*(.*?)\s*\*\*/ but it didn't work. So I changed it to split /^\*+\s*(.*?)\s*\*+$/ and it still didn't work.

Here is my code:

[code]
#!/usr/bin/perl

$/;

$datadir = "/home/httpd/html/wrap";
open(INFILE, "$datadir/sssss") || die "Cannot open file: $!\n";
$input = <INFILE>;
close(INFILE);

open(OUTFILE, "> $datadir/sample_output.txt") or die "Cannot open file: $!\n";
foreach my $section ( split /^\*+\s*(.*?)\s*\*+$/, $input)
{
    chomp($section);
    $section =~ s/^\s+|\s+$//g;
    next unless $section =~ m/Agent\s+Notes/;
    ($notes) = $section =~ m/Wrap-up\s+Note.(.*)$/;
    $agent_notes{agent_notes} = $notes;

    # Testing printing
    print OUTFILE "=============================\n";
    print OUTFILE "Header ...... $section \n";
    print OUTFILE "$agent_notes{agent_notes} \n";
    print OUTFILE "==============================\n";
}
close(OUTFILE);
[/code]

Here is my data:

************* Agent Notes ******************
Wrap-up Note:
ok


********* Customer Transcript **************
Transcript:
Connect Wednesday, November 01, 2000 - 05:11:09 PM
Connected. Ready to assist customer Ijattsu.

Greeting message: Wednesday, November 01, 2000 - 05:11:09 PM
Greetings from blah blah blah

#6  vpopper, January 3rd, 2001, 04:39 PM

[quote]Originally posted by ThaBomb:
I finally had a chance to try your code snippet today and I couldn't get it to work. Am I doing something wrong?
[/quote]

This line:
$/;

Should be changed to this:
undef $/;

The $/ variable is the input record separator, which defaults to a newline character. If we undef it, we slurp in the whole file rather than reading one line at a time. A bare $/; statement only evaluates the variable and leaves it unchanged.

Since you didn't undef it, you only read one line. This line would cause it to be skipped:

next unless $section =~ m/Agent\s+Notes/;
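
The effect of $/ can be seen without a real file. The sketch below assumes Perl 5.8 or later (for in-memory filehandles) and reads the same three-line string once with the default separator and once slurped. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $data = "line one\nline two\nline three\n";

# Default: $/ is "\n", so a scalar-context read returns a single line.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!\n";
my $first_line = <$fh>;
close $fh;

# Slurp: with $/ undefined, the same kind of read returns everything.
# "local" restores the previous value of $/ when the block exits.
my $whole = do {
    local $/;
    open my $in, '<', \$data or die "Cannot open in-memory file: $!\n";
    <$in>;
};

printf "default read: %d chars\n", length $first_line;
printf "slurped read: %d chars\n", length $whole;
```

Using `local $/` inside a block, rather than a plain `undef $/` at the top of the script, keeps the change from affecting any later line-by-line reads.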

Also, if you are going to output the section as it is read, you don't need to store it in a hash. You can just output $notes. But you'll also need some identifier for the notes, unless you are just printing them all without association.
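
One way to associate notes with an identifier is sketched below. It assumes that each record carries an "Email Address:" line, as in the sample data file, and uses that address as the hash key. Both the helper name `store_notes` and the choice of key are hypothetical; a ticket or customer number would work the same way.

```perl
use strict;
use warnings;

my %agent_notes;    # identifier => wrap-up note

# Hypothetical helper: file one record's note under its e-mail address.
# Returns the key that was used, or undef if either field is missing.
sub store_notes {
    my ($record) = @_;

    my ($email) = $record =~ m/^Email Address:[ \t]*(\S+)/m;

    # Capture up to the next banner line (three asterisks) or end of input.
    my ($notes) = $record =~ m/Wrap-up\s+Note:\s*(.*?)\s*(?=^\*{3}|\z)/ms;
    return unless defined $email and defined $notes;

    $agent_notes{$email} = $notes;
    return $email;
}

my $record = <<'END';
********** Binding ************
Email Address: ThaBomb@weareit.com

******** Agent Notes **********
Wrap-up Note:
ok

*** Customer's Email Message ***
END

store_notes($record);
print "$_: $agent_notes{$_}\n" for sort keys %agent_notes;
```

If the notes are written out or inserted into the database as soon as they are read, the hash can be skipped and the two values passed along directly.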



Viewing: Dev Shed Forums > Programming Languages > Perl Programming > Parsing Help


  
 





© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 5 hosted by Hostway