#1
Pattern-matching an html document
I want to grab the HTML from a web page and insert certain lines into an array.
Code:
#!/usr/bin/perl -w
use LWP::Simple;
$html2009 = get('http://www.ustreas.gov/offices/domestic-finance/debt-management/interest-rate/yield_historical.shtml');
@dates2009 = ();
#scrape dates#
while ($line = $html2009){
if ($line =~ m/^"<td class=\"smaller\" headers=\"1\">\d\d\/\d\d\/\d\d<\/td>"/i){
push (@dates2009, $line);
}
}
print @dates2009;
It occurred to me that I'm not really assigning a single line to $line, but the whole document. How do I read in one line at a time, or is there a better way to extract the data I'm looking for?
#2
@lines = split(/\n/, $html2009);
foreach (@lines) { ... }
But looking at your pattern match, ^"<td, you are requiring each line to start with "<td, literal double quote included. Is that likely? You may want to look into HTML::Parser on CPAN; there are a good few HTML parsing modules over there.
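To make the split-and-match suggestion concrete, here is a minimal sketch. It does not fetch the live page: the sample markup in $html2009 is an assumption reconstructed from the pattern in the first post, so the real page may differ. The pattern also drops the literal double quotes, allows leading whitespace, and captures just the date rather than pushing the whole line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample standing in for the string returned by LWP::Simple's get().
# The markup is assumed from the pattern in the original post.
my $html2009 = join "\n",
    '<tr>',
    '<td class="smaller" headers="1">01/02/09</td>',
    '<td class="smaller" headers="2">0.04</td>',
    '</tr>',
    '<tr>',
    '  <td class="smaller" headers="1">01/05/09</td>',
    '<td class="smaller" headers="2">0.05</td>',
    '</tr>';

my @dates2009;

# Split the whole document into lines, then test each line on its own.
foreach my $line (split /\n/, $html2009) {
    # No literal quotes around the tag; \s* tolerates indentation.
    # m{...} delimiters avoid having to escape every slash.
    if ($line =~ m{^\s*<td class="smaller" headers="1">(\d\d/\d\d/\d\d)</td>}i) {
        push @dates2009, $1;    # keep only the captured date
    }
}

print "$_\n" for @dates2009;
```

This only works while each cell sits on its own line. If the page puts several cells on one line, or wraps a cell across lines, a real parser such as HTML::Parser is the safer route.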
Last edited by Axweildr : June 25th, 2009 at 02:30 PM.