November 17th, 2003, 11:43 PM
Here's the problem: I need to retrieve a web page given a URL.
h = httplib.HTTP(hostport)
is a problem because if the URL is "http://www.whatever.com", then the absolute path is "" and that won't work.
h = urllib.urlopen(URL)
is a problem because it won't detect a 404 error!
I need a little bit of both. There must be some way to get the status code using urlopen, right?
November 18th, 2003, 02:21 AM
Surely all you have to do with urllib is check the returned page for the 404 or 500 error strings (the page title would be the easiest thing to test)? Couldn't be simpler.
If you really want a module to handle this, I think urllib2 should work.
Another way to do it would be to manipulate the abspath; Python has a great module for this kind of thing (os.path)!
Oh, I've tried both ways (httplib and urllib). How exactly are you detecting these errors? No special headers are sent as far as I can see...
November 18th, 2003, 02:46 AM
Did you try going the opposite way - using urllib.URLopener, and processing the HTTP-EQUIV headers if needed?
Last edited by Igor Pechersky; November 18th, 2003 at 02:50 AM.
November 18th, 2003, 05:22 AM
I guess it's a Zope thing, so you're using 1.5.2 code? But in 2.x ...
import httplib

hostport = 'www.bbc.co.uk'
abspath = '/'   # '' would make a malformed request line; '/' fetches the root
h = httplib.HTTPConnection(hostport)
h.request('GET', abspath)   # the request must be sent before getresponse()
resp = h.getresponse()
if resp.status == 200:
    data = resp.read()
November 18th, 2003, 10:05 PM
I'm actually using urllib2 now and it works fine.
The only thing is that I need to detect a timeout.
How can I do this without getting down to the socket level?
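For later readers: one option that keeps the socket level out of your own code, sketched here with an arbitrary 10-second value, is socket.setdefaulttimeout (new in Python 2.3). It sets a default timeout on every socket created afterwards, including the ones urllib2 opens internally.

```python
import socket

# Every socket created after this call - including those that
# urllib2 opens internally - gets a 10-second timeout.
# (10.0 is just an example value.)
socket.setdefaulttimeout(10.0)

# A timed-out request then raises socket.timeout (possibly wrapped
# in urllib2.URLError), which you can catch around urlopen().
```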