June 18th, 2004, 02:27 PM
I am using python to check whether certain urls exist.
The list would be something like:
Most urls are opened without a problem and the code works as it should. But a few urls on a specific server make the script hang. I am pretty sure this is a server issue because when I run wget from the command line it hangs as well. What I would like is a timeout feature but as far as I can tell urllib2 doesn't have one. Is there a way to create my own timeout feature for urllib2.urlopen(url)?
Also, I forgot to mention that the server is running Python 2.2.2, so I do not have access to the socket timeout functions.
url = "http://aaa-6020-oci-425h.qa.lan/php/admin/launch.php"
site_success_flag = True
site_failure_flag = True
Last edited by Theeggman; June 18th, 2004 at 04:23 PM.
June 19th, 2004, 05:43 AM
Your best bet would probably be to look through the urllib module, especially the URLopener() class, and see how that works. Unfortunately, since these are both based on sockets, if there is no way to set a socket timeout then you might have a problem.
Sorry I couldn't be more help,
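For readers who are not tied to 2.2.2: socket.setdefaulttimeout() was added in Python 2.3, and urlopen() accepts a timeout argument from Python 2.6 onwards. A minimal sketch of the URL check on a current Python 3 interpreter might look like the following. The names url_exists and open_url are made up for illustration, and treating every OSError as "does not exist" is an assumption about what the original script wants.

```python
import urllib.request


def url_exists(url, timeout=5.0, open_url=urllib.request.urlopen):
    """Return True if the URL opens within `timeout` seconds, else False.

    `open_url` is a hypothetical hook so a custom opener can be swapped in.
    """
    try:
        response = open_url(url, timeout=timeout)
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError.
        return False
    response.close()
    return True
```

Note that this lumps together "server hung", "connection refused" and "HTTP 404"; split the except clause if the script needs to tell them apart.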
June 19th, 2004, 07:28 AM
If you can get hold of the socket object you could try using the socket.setsockopt function to set the timeout. This calls the low-level C function of the same name, so you will need to read the C docs for the function. You want to set the SO_RCVTIMEO and SO_SNDTIMEO options. These should be available on both UNIX and Windows with WinSock 2.0, but I suspect the parameters will be different.
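A rough sketch of the setsockopt idea, written for Python 3 on Linux. The parameter layout is platform specific, as suspected above: this assumes the option value is a C struct timeval made of two native longs, which is not what Windows expects. set_socket_timeouts is a made-up helper name.

```python
import socket
import struct


def set_socket_timeouts(sock, seconds):
    """Set OS-level receive and send timeouts on an existing socket.

    Assumes a Linux-style struct timeval (two native longs).
    """
    whole = int(seconds)
    micros = int((seconds - whole) * 1000000)
    timeval = struct.pack("ll", whole, micros)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVTIMEO, timeval)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO, timeval)
```

Once the timeout expires, a blocked recv() fails with an OSError (EAGAIN) instead of waiting forever. The hard part remains getting at the socket object that urllib2 creates internally.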
You could also spawn multiple threads to check several servers in parallel. This will stop the whole program hanging if one server is down, and will be much faster overall.
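The threaded idea could be sketched like this in Python 3. check_all, url_opens and the deadline parameter are names invented for this example. The worker threads are marked as daemon threads and the main thread only waits until the deadline, so a hung server leaves its entry as None rather than blocking the program.

```python
import threading
import time
import urllib.request


def url_opens(url):
    """Hypothetical checker: True if the URL opens, False on any OSError."""
    try:
        urllib.request.urlopen(url).close()
        return True
    except OSError:
        return False


def check_all(urls, check=url_opens, deadline=10.0):
    """Check every URL in its own thread; None means no answer by the deadline."""
    results = dict.fromkeys(urls)

    def worker(url):
        results[url] = check(url)

    threads = [threading.Thread(target=worker, args=(u,)) for u in urls]
    for t in threads:
        t.daemon = True  # a stuck worker will not keep the interpreter alive
        t.start()
    end = time.monotonic() + deadline
    for t in threads:
        t.join(max(0.0, end - time.monotonic()))
    return dict(results)
```

The stuck threads are not killed, only abandoned. That is fine for a one-shot script but would leak threads in a long-running process.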
Dave - The Developers' Coach
Last edited by DevCoach; June 19th, 2004 at 09:04 AM.
June 19th, 2004, 07:55 AM
Doh, never thought of that. I like the threaded idea; I wonder if you could run some kind of timer and force the thread to close when the timer reaches a certain level? Not sure about the speed increase though - I've read about a lot of cases where the programmer introduced threads to boost the speed of his program, which ended up running at the same speed. This may be tied to that particular case though?
June 19th, 2004, 09:21 AM
Originally Posted by netytan
I think the speed increase will be real. If an application is doing a lot of processing then splitting it into threads will not speed it up - the computer still has to do the same amount of work. In this case, however, the flow of the code goes something like this...
send a request to the server...
... wait for a response...
... keep waiting....
... get a response and process it (or time out)
so probably 99% of the time the program is doing nothing except waiting for a response. It could just as easily be waiting for 20 responses as for 1.
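To make the waiting argument concrete, here is a small Python 3 experiment in which time.sleep() stands in for the wait on a server response. This is a simulation, not real network traffic, and timed_parallel_waits is a made-up name.

```python
import threading
import time


def timed_parallel_waits(count, wait=0.2):
    """Start `count` threads that each just wait; return total elapsed seconds."""
    threads = [threading.Thread(target=time.sleep, args=(wait,))
               for _ in range(count)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start
```

Calling timed_parallel_waits(20) should take roughly as long as timed_parallel_waits(1), because the waits overlap instead of adding up.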
Unfortunately there is no way of killing a thread in Python, which is a shame. The usual solution would be to have the thread polling for an exit flag, but it can't do that in this case since it is blocked on the socket call. You can do it if the socket is called with 'select', but since we are calling it through the urllib2 library we do not have that sort of low-level control of the socket. I have not looked at the urllib2 code, but it might be possible to subclass it to do this, although IMHO it would be more trouble than it was worth.
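For comparison, this is roughly what the select-plus-exit-flag pattern looks like when you do own the socket. It is a Python 3 sketch on a raw socket, not something urllib2 exposes. recv_with_exit_flag, poll_interval and max_wait are names invented for this example.

```python
import select
import socket
import threading


def recv_with_exit_flag(sock, stop, poll_interval=0.1, max_wait=5.0):
    """Wait for data on `sock`, checking the `stop` event between short selects.

    Returns the received bytes, or None if stopped or nothing arrived in time.
    """
    waited = 0.0
    while not stop.is_set() and waited < max_wait:
        readable, _, _ = select.select([sock], [], [], poll_interval)
        if readable:
            return sock.recv(4096)
        waited += poll_interval
    return None
```

Another thread can call stop.set() to make the waiting thread give up within about one poll interval, which is the cooperative exit described above.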
Dave - The Developers' Coach
June 20th, 2004, 10:17 PM
I appreciate all the feedback. I tried the thread idea and it would still hang. I was not able to kill the thread, at least that is what I think was happening.
But I did find a solution using the signal module. I set an alarm before each try/except statement; if the alarm goes off, the url does not exist, otherwise the url exists. It seems sort of sloppy but it works well.
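import signal, urllib2
def handler(signum, frame):
    print 'Signal handler called with signal', signum
    raise IOError, "Couldn't open device!"
# Set the signal handler and a 5-second alarm
signal.signal(signal.SIGALRM, handler)
signal.alarm(5)
try:
    # This open() may hang indefinitely
    urllib2.urlopen(url)
except IOError:
    print "does not exist"
signal.alarm(0)  # Cancel the alarm so it cannot fire later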
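The same alarm trick can be wrapped up for reuse. Below is a Python 3 sketch, assuming a Unix system and that the code runs in the main thread, since signal handlers can only be installed there. time_limit and AlarmTimeout are made-up names. It uses signal.setitimer so that fractional seconds work.

```python
import signal
from contextlib import contextmanager


class AlarmTimeout(Exception):
    """Raised when the guarded block runs past its time limit."""


@contextmanager
def time_limit(seconds):
    def handler(signum, frame):
        raise AlarmTimeout("timed out after %s seconds" % seconds)

    previous = signal.signal(signal.SIGALRM, handler)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        # Always cancel the timer and restore the old handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)
```

Usage would be along the lines of "with time_limit(5): urlopen(url)", treating AlarmTimeout as "does not exist".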