urllib2.urlopen() REVISITED
June 18th, 2004, 01:27 PM
I am using Python to check whether certain URLs exist.
The list would be something like:
http://aaa-6020-sql-425h.qa.lan/php/admin/launch.php
http://aaa-6020-oci-425h.qa.lan/php/admin/launch.php
Most URLs open without a problem and the code works as it should, but a few URLs on one specific server make the script hang. I am pretty sure this is a server issue, because wget hangs as well when I run it from the command line. What I would like is a timeout feature, but as far as I can tell urllib2 doesn't have one. Is there a way to create my own timeout for urllib2.urlopen(url)?
Code:
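import urllib2

try:
    url = "http://aaa-6020-oci-425h.qa.lan/php/admin/launch.php"
    urllib2.urlopen(url)  # hangs here if the server never responds
    site_success_flag = True
except Exception:
    # any failure to open the URL is treated as "does not exist"
    site_failure_flag = True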
Also, I forgot to mention that the server is running Python 2.2.2, so I do not have access to the socket timeout functions.
Last edited by Theeggman : June 18th, 2004 at 03:23 PM.

June 19th, 2004, 04:43 AM
Your best bet would probably be to look through the urllib module, especially the URLopener() class, and see how it works. Unfortunately, since both modules are built on sockets, you might have a problem if there is no way to set a socket timeout.
Sorry I couldn't be more help,
Mark.

June 19th, 2004, 06:28 AM
If you can get hold of the socket object you could try using the socket.setsockopt function to set the timeout. This calls the low-level C function of the same name, so you will need to read the C docs for the function. You want to set the SO_RCVTIMEO and SO_SNDTIMEO options. These should be available on both UNIX and Windows with WinSock 2.0, but I suspect the parameters will be different.
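For anyone who wants to try the setsockopt route, here is a rough sketch of what it might look like once you have the socket object in hand. It assumes a current Python on a Linux-style system where the option value is a struct timeval packed as two native longs; Windows expects a millisecond DWORD instead, so the packing would have to change there. The helper names set_io_timeouts and recv_or_none are made up for illustration, and getting at the socket that urllib2 uses internally is a separate problem this does not solve.

```python
import socket
import struct


def set_io_timeouts(sock, seconds):
    """Set SO_RCVTIMEO and SO_SNDTIMEO on an existing socket.

    Assumption: struct timeval is two native longs (true on typical
    Linux builds). This packing is NOT portable to Windows.
    """
    whole = int(seconds)
    micro = int((seconds - whole) * 1000000)
    timeval = struct.pack("ll", whole, micro)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVTIMEO, timeval)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO, timeval)


def recv_or_none(sock, nbytes=1):
    """Return received bytes, or None if recv() fails.

    When the kernel-level receive timeout expires, recv() fails with
    EAGAIN/EWOULDBLOCK, which surfaces as socket.error. Any other
    socket error is also reported as None in this sketch.
    """
    try:
        return sock.recv(nbytes)
    except socket.error:
        return None
```

With this in place a blocked recv() gives up after the chosen interval instead of waiting forever, which is the behaviour the original question is after.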
You could also spawn multiple threads to check several servers in parallel. This will stop the whole program hanging if one server is down, and will be much faster overall.
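A minimal sketch of the threaded idea, assuming a current Python 3 rather than the 2.2.2 install mentioned earlier. The function check_all and its check argument are hypothetical: check is whatever callable you use to test one URL (for example a small wrapper around urlopen), and it is passed in so the sketch does not depend on any particular server. The threads are marked as daemon threads and joined against a shared deadline, so a check that never returns is simply reported as None instead of blocking the whole run.

```python
import threading
import time


def check_all(urls, check, wait_seconds):
    """Run check(url) for every URL, each in its own daemon thread.

    Returns a dict {url: result}. A URL whose check has not finished
    within wait_seconds (one deadline shared by all threads) maps to
    None. A check that raises is recorded as False.
    """
    results = {}
    lock = threading.Lock()

    def worker(url):
        try:
            outcome = check(url)
        except Exception:
            outcome = False
        with lock:
            results[url] = outcome

    threads = []
    for url in urls:
        t = threading.Thread(target=worker, args=(url,))
        t.daemon = True  # a stuck thread will not keep the program alive
        t.start()
        threads.append(t)

    # Wait for each thread, but never past the shared deadline.
    deadline = time.time() + wait_seconds
    for t in threads:
        t.join(max(0.0, deadline - time.time()))

    with lock:
        return dict((url, results.get(url)) for url in urls)
```

Note that the stuck thread is not killed, only abandoned; it keeps waiting in the background until the program exits.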
Dave - The Developers' Coach
Last edited by DevCoach : June 19th, 2004 at 08:04 AM.

June 19th, 2004, 06:55 AM
Doh, never thought of that. I like the threaded idea. I wonder if you could run some kind of timer and force the thread to close when the timer reaches a certain level? I'm not sure about the speed increase though: I've read about a lot of cases where the programmer introduced threads to boost the speed of his program and it ended up running at the same speed. That may be tied to those particular cases though?
Mark.

June 19th, 2004, 08:21 AM
Quote: Originally posted by netytan
Doh, never thought of that. I like the threaded idea. I wonder if you could run some kind of timer and force the thread to close when the timer reaches a certain level? I'm not sure about the speed increase though: I've read about a lot of cases where the programmer introduced threads to boost the speed of his program and it ended up running at the same speed. That may be tied to those particular cases though?
Mark.
I think the speed increase will be real. If an application is doing a lot of processing then splitting it into threads will not speed it up - the computer still has to do the same amount of work. In this case, however, the flow of the code goes something like this...
send a request to the server...
... wait for a response...
... keep waiting....
... etc...
... get a response and process it (or time out)
so probably 99% of the time the program is doing nothing except waiting for a response. It could just as easily be waiting for 20 responses as for 1.
Unfortunately there is no way of killing a thread in Python, which is a shame. The usual solution would be to have the thread poll for an exit flag, but it can't do that in this case because it is blocked on the socket call. You can do it if the socket is waited on with 'select', but since we are going through the urllib2 library we do not have that sort of low-level control of the socket. I have not looked at the urllib2 code, but it might be possible to subclass it to do this, although IMHO it would be more trouble than it is worth.
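To make the select idea concrete, here is a rough sketch of a read loop that polls an exit flag between short select calls. It assumes you already hold the raw socket, which is exactly what urllib2 does not give you, so treat it as an illustration of the technique and not as a drop-in fix. The name recv_until_stopped and the poll_interval parameter are hypothetical, and stop_event is expected to be a threading.Event.

```python
import select


def recv_until_stopped(sock, nbytes, stop_event, poll_interval=0.1):
    """Read from sock unless stop_event is set first.

    Instead of blocking inside recv(), wait in select() for at most
    poll_interval seconds at a time and re-check the exit flag after
    each wait. Returns the bytes read, or None if asked to stop.
    """
    while not stop_event.is_set():
        readable, _, _ = select.select([sock], [], [], poll_interval)
        if readable:
            return sock.recv(nbytes)
    return None
```

Another thread, or a timer, can then call stop_event.set() to make the reader give up within roughly one poll interval.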
Dave - The Developers' Coach

June 20th, 2004, 09:17 PM
I appreciate all the feedback. I tried the thread idea and it would still hang. I was not able to kill the thread, or at least that is what I think was happening.
But I did find a solution using the signal module. I set an alarm before each try/except statement; if the alarm goes off, the URL does not exist, otherwise the URL exists. It seems sort of sloppy, but it works well.
Code:
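import signal
import urllib2

url = "http://aaa-6020-oci-425h.qa.lan/php/admin/launch.php"

def handler(signum, frame):
    # Runs when the alarm fires; the exception interrupts the blocked urlopen()
    print 'Signal handler called with signal', signum
    raise IOError, "Timed out opening URL"

# Set the signal handler and a 4-second alarm
signal.signal(signal.SIGALRM, handler)
signal.alarm(4)
# This urlopen() may hang indefinitely
try:
    try:
        urllib2.urlopen(url)
        print "exists"
    except Exception:
        print "does not exist"
finally:
    signal.alarm(0)  # cancel the alarm if it has not fired yet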
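If the alarm approach is going to be used for a whole list of URLs, it may be tidier to wrap it in a helper. The following is only a sketch, assuming a current Python 3 on a Unix-like system (SIGALRM does not exist on Windows) and that the call is made from the main thread, since signal handlers can only be installed there. The names Timeout and call_with_alarm are invented for this example. The helper cancels the alarm and restores the previous handler afterwards, so a late alarm cannot fire in unrelated code.

```python
import signal


class Timeout(Exception):
    """Raised when the wrapped call takes longer than allowed."""


def call_with_alarm(func, seconds, *args):
    """Call func(*args), raising Timeout after `seconds` whole seconds.

    Uses SIGALRM, so it is Unix-only and must run in the main thread.
    """
    def handler(signum, frame):
        raise Timeout()

    previous = signal.signal(signal.SIGALRM, handler)
    signal.alarm(seconds)
    try:
        return func(*args)
    finally:
        signal.alarm(0)  # cancel the alarm if it is still pending
        signal.signal(signal.SIGALRM, previous)  # restore old handler
```

A URL check would then be something like calling call_with_alarm with your own open function and treating Timeout the same way as any other failure.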