#1
Hi, I am having a little problem with LWP. This script checks links in a mysql database. Everything works ok, but as I was testing I noticed an issue. When it checks a url that no longer exists, but the server it formerly resided on has custom 404 error pages, the link checks as being valid. I guess what I want to do is only consider the url valid if it returns a 200. How can I modify the script to work like this?
#!/usr/bin/perl
use LWP::UserAgent;
use DBI;

$db_database = "db";
$db_uid = "user";
$db_pwd = "pass";

($ua = LWP::UserAgent->new)->timeout(20); #actually set timeout

$dbh = DBI->connect ("DBI:mysql:$db_database".$mysqlsock, $db_uid, $db_pwd) or die("could not connect to db\n");
$sth = $dbh->prepare("SELECT url FROM files");
$sth -> execute();
$numrows = $sth->rows;

$i = 0;
$works = 0;
$notworks = 0;
print "\n\n";

#while (my $url = $sth->fetchrow_array) {
while (defined(my $url = $sth->fetchrow_array)) {
    if(($ua->request(HTTP::Request->new('HEAD', $url)))->is_success()) {
        $validity = "link works";
        $valid_update = $dbh->do("UPDATE files SET valid = 1 WHERE url = '$url'");
        ++$works;
    } else {
        $validity = "link sucks";
        $valid_update = $dbh->do("UPDATE files SET valid = valid + 1 WHERE url = '$url'");
        ++$notworks;
    }
    ++$i;
    print "$i of $numrows\n$validity\n$url\n\n";
}
$sth->finish;

[Edited by scream on 03-05-2001 at 04:27 PM]
#2
Scream,
One thing you could do is look into HTTP::Status alongside LWP::UserAgent. That module has a bunch of functions for checking the status code of an HTTP response, so you can catch error codes such as 404. Check out the docs to see some examples. I know of this module from a link-checking system that was in place at my old job. I've never actually used it myself, so I'm not sure what else to tell you. I would assume, though, that it is used in conjunction with HTTP::Request and HTTP::Response to get the URL. Those modules usually go together with LWP::UserAgent, so you may want to check things out there too.

This is a direction to go in. To get more information on these modules, do two things:
1. Read the perldocs for all of them.
2. Ask your question at PerlMonks. That is a great source of Perl information.

Hope that helps.
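For anyone curious what the HTTP::Status functions look like in practice, here is a rough sketch. It assumes a reasonably recent HTTP::Status that provides the `:constants` and `:is` export tags. `link_state` is a made-up helper for illustration, and the status codes in the loop are sample inputs rather than real responses.

```perl
use strict;
use warnings;
use HTTP::Status qw(:constants :is status_message);

# Classify a numeric HTTP status code the way a link checker might.
# link_state() is a hypothetical helper, not part of HTTP::Status.
sub link_state {
    my ($code) = @_;
    return 'ok'       if $code == HTTP_OK;     # exactly 200
    return 'redirect' if is_redirect($code);   # any 3xx
    return 'broken'   if is_error($code);      # any 4xx or 5xx
    return 'other';                            # 1xx, or a 2xx other than 200
}

# Sample codes only; a real checker would take these from $response->code.
for my $code (200, 204, 301, 404, 500) {
    printf "%d %-22s %s\n", $code, status_message($code), link_state($code);
}
```

In a real script the code would come from the HTTP::Response object that LWP::UserAgent returns, so HTTP::Status works with LWP rather than replacing it.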
__________________
- dsb - Perl Guy
#3
Thanks, I'll look into HTTP::Request and HTTP::Response. I also posted on PerlMonks. That's a great site!
I appreciate the help, dsb, and am open to anything someone else may have to offer. Regards, Ryan
#4
I found a solution. I only need to change the conditions of my if/else test:
if(($ua->request(HTTP::Request->new('HEAD', $url)))->code() == 200) {
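To see how this differs from the original `is_success()` test, here is a small sketch that compares the two checks. The responses are built by hand so it runs without network access, and `link_is_valid` is a hypothetical helper name, not something from LWP.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: treat a link as valid only on an exact 200.
sub link_is_valid {
    my ($response) = @_;
    return $response->code == 200 ? 1 : 0;
}

# Hand-built responses standing in for what $ua->request(...) would return.
my @samples = (
    HTTP::Response->new(200, 'OK'),
    HTTP::Response->new(204, 'No Content'),
    HTTP::Response->new(404, 'Not Found'),
);

for my $res (@samples) {
    printf "%s -> is_success=%d, valid=%d\n",
        $res->status_line, ($res->is_success ? 1 : 0), link_is_valid($res);
}
```

`is_success()` is true for any 2xx code, while the `code() == 200` test accepts only 200, so the two checks disagree on a response such as 204.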
#5
Yeah, I saw your question over at PerlMonks. That is a much better solution. I learned something too.