SunQuest
           Perl Programming
 
Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
User Name:
Password:
Remember me
Try It Free
Go Back   Dev Shed ForumsProgramming LanguagesPerl Programming

Reply
Add This Thread To:
  Del.icio.us   Digg   Google   Spurl   Blink   Furl   Simpy   Y! MyWeb 
Thread Tools Search this Thread Rate Thread Display Modes
 
Unread Dev Shed Forums Sponsor:
Get inside! Sample the range of functionality easily built with JMSL Library for Time Series Data Analysis, Heat Maps, Portfolio Optimization, Monte Carlo Simulation, Stock Price Charting and more. Download Now!
  #1  
Old March 5th, 2001, 05:24 PM
scream scream is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jul 2000
Posts: 441 scream User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 22 h 59 m 38 sec
Reputation Power: 9
Send a message via ICQ to scream
Angry

Hi, I am having a little problem with LWP. This script checks links in a mysql database. Everything works ok, but as I was testing I noticed an issue. When it checks a url that no longer exists, but the server it formerly resided on has custom 404 error pages, the link checks as being valid. I guess what I want to do is only consider the url valid if it returns a 200. How can I modify the script to work like this?

#!/usr/bin/perl

use LWP::UserAgent;
use DBI;

$db_database = "db";
$db_uid = "user";
$db_pwd = "pass";
($ua = LWP::UserAgent->new)->timeout(20); #actually set timeout


$dbh = DBI->connect ("DBI:mysql:$db_database".$mysqlsock, $db_uid, $db_pwd) or die("could not connect to db\n");
$sth = $dbh->prepare("SELECT url FROM files");
$sth -> execute();
$numrows = $sth->rows;

$i = 0;
$works = 0;
$notworks = 0;
print "\n\n";
#while (my $url = $sth->fetchrow_array) {
while (defined(my $url = $sth->fetchrow_array)) {
if(($ua->request(HTTP::Request->new('HEAD', $url)))->is_success()) {
$validity = "link works";
$valid_update = $dbh->do("UPDATE files SET valid = 1 WHERE url = '$url'");
++$works
}
else {
$validity = "link sucks";
$valid_update = $dbh->do("UPDATE files SET valid = valid + 1 WHERE url = '$url'");
++$notworks;
}
++$i;
print "$i of $numrows\n$validity\n$url\n\n";
}
$sth->finish;

[Edited by scream on 03-05-2001 at 04:27 PM]

Reply With Quote
  #2  
Old March 5th, 2001, 08:37 PM
dsb dsb is offline
PerlGuy
Dev Shed Novice (500 - 999 posts)
 
Join Date: Jan 2001
Posts: 714 dsb User rank is Sergeant Major (2000 - 5000 Reputation Level)dsb User rank is Sergeant Major (2000 - 5000 Reputation Level)dsb User rank is Sergeant Major (2000 - 5000 Reputation Level)dsb User rank is Sergeant Major (2000 - 5000 Reputation Level)dsb User rank is Sergeant Major (2000 - 5000 Reputation Level)dsb User rank is Sergeant Major (2000 - 5000 Reputation Level) 
Time spent in forums: 2 Days 15 h 44 m 20 sec
Reputation Power: 36
Send a message via AIM to dsb
Wink

Scream,
One thing you could do is, rather than use LWP::UserAgent, look into using HTTP::Status. That extension has a bunch of functions that can be used to check the status line of the HTTP Headers for error codes such as 404 Errors. Check out the docs to see some examples.

I know of this module from a system that was in place at my old job that was used to check links. I've never actually used so I'm not sure what else to tell you. I would assume though, that it would be used in conjunction with HTTP::Request and HTTP::Response to get the URL. Using these modules includes the use of LWP::UserAgent in many instances so you may to check things out there.

This is a direction to go in. To get more information on these extensions do two things.
1. Read the Perldocs for all of them.
2. Ask your question at Perlmonks. That is a great source of Perl information.

Hope that helps.
__________________
- dsb -
Perl Guy

Reply With Quote
  #3  
Old March 6th, 2001, 12:10 PM
scream scream is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jul 2000
Posts: 441 scream User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 22 h 59 m 38 sec
Reputation Power: 9
Send a message via ICQ to scream
Thanks, I'll look into HTTP::Request and HTTP::Response. I also posted on Perl Monks. That's a great site!

I appreciate the help dsb, and am open to anything someone else may have to offer.

Regards, Ryan

Reply With Quote
  #4  
Old March 6th, 2001, 02:56 PM
scream scream is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jul 2000
Posts: 441 scream User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 22 h 59 m 38 sec
Reputation Power: 9
Send a message via ICQ to scream
I found a solution. I only need to change the conditions of my if/else test:

if(($ua->request(HTTP::Request->new('HEAD', $url)))->code() == 200) {

Reply With Quote
  #5  
Old March 7th, 2001, 09:03 AM
dsb dsb is offline
PerlGuy
Dev Shed Novice (500 - 999 posts)
 
Join Date: Jan 2001
Posts: 714 dsb User rank is Sergeant Major (2000 - 5000 Reputation Level)dsb User rank is Sergeant Major (2000 - 5000 Reputation Level)dsb User rank is Sergeant Major (2000 - 5000 Reputation Level)dsb User rank is Sergeant Major (2000 - 5000 Reputation Level)dsb User rank is Sergeant Major (2000 - 5000 Reputation Level)dsb User rank is Sergeant Major (2000 - 5000 Reputation Level) 
Time spent in forums: 2 Days 15 h 44 m 20 sec
Reputation Power: 36
Send a message via AIM to dsb
Wink

Yeah, I saw you're question over at Perlmonks. That is a much better solution. I learned something too.

Reply With Quote
Reply

Viewing: Dev Shed ForumsProgramming LanguagesPerl Programming > LWP Help


Thread Tools  Search this Thread 
Search this Thread:

Advanced Search
Display Modes  Rate This Thread 
Rate This Thread:


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
View Your Warnings | New Posts | Latest News | Latest Threads | Shoutbox
Forum Jump


Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
  
 





© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 5 hosted by Hostway