#1
  1. No Profile Picture
    I hate nerds
    Devshed Novice (500 - 999 posts)

    Join Date
    Jul 2003
    Posts
    540
    Rep Power
    0

    network programming!


    Here's the problem: I need to retrieve a web page given a URL.

    using

    h = httplib.HTTP(hostport)
    h.putrequest('GET', abspath)

    is a problem because if the URL is "http://www.whatever.com", then the absolute path is "" and that won't work.

    using

    h = urllib.urlopen(URL)

    is a problem because it won't detect a 404 error!

    I need a little bit of both. There must be some way to get the status code using urlopen, right?
  2. #2
  3. Hello World :)
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2003
    Location
    Hull, UK
    Posts
    2,537
    Rep Power
    69
    Surely all you have to do with urllib is check for the 404 or 500 error strings (the page title would be the easiest thing to test). Couldn't be simpler.

    If you really want a module to handle this, I think urllib2 should work.

    Another way to do this would be to manipulate the abspath. Python has a great module for this kind of thing (os.path)!

    Oh, I've tried both ways (httplib and urllib). How exactly are you detecting these errors? No special headers are sent as far as I can see.

    Mark.
    programming language development: www.netytan.com Hula
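
    On the status code question: urllib2 reports 4xx/5xx responses by raising an exception instead of handing back the error page. Below is a minimal sketch of reading the status that way. It is written for Python 3, where urllib2 became urllib.request and urllib.error. The helper name fetch_status and its opener parameter are hypothetical; opener is there only so the function can be tried without a network.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10, opener=urllib.request.urlopen):
    """Return (status_code, body_bytes) for url, including error responses.

    `opener` is a hypothetical hook; by default it is the real urlopen.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # 404, 500 and friends arrive here; the status code is on err.code
        return err.code, err.read()
```

    With this, a missing page should come back with 404 as the first element, so there is no need to scan the page title for error strings.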

  4. #3
  5. No Profile Picture
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2003
    Posts
    28
    Rep Power
    0
    Did you try going the opposite way: use urllib.URLopener, and process HTTP-EQUIV if needed?
    Last edited by Igor Pechersky; November 18th, 2003 at 02:50 AM.
  6. #4
  7. Mini me.
    Devshed Novice (500 - 999 posts)

    Join Date
    Nov 2003
    Location
    Cambridge, UK
    Posts
    783
    Rep Power
    13
    I guess it's a Zope thing so you use 1.5.2 code?

    But in 2.x ...

    import httplib

    hostport = 'www.bbc.co.uk'
    abspath = ''
    # abspath = '/' works too

    # HTTPConnection hands back a response object that carries the status code
    h = httplib.HTTPConnection(hostport)
    h.request('GET', abspath)
    resp = h.getresponse()

    if resp.status == 200:
        data = resp.read()
        print data
    else:
        print "Error"
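
    For the empty-path case from the first post, one option is to derive hostport and abspath from the URL and substitute "/" when the path is empty. A rough sketch, assuming Python 3 (urllib.parse; the Python 2 module was urlparse). The function name split_url is made up for illustration.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into (hostport, abspath) suitable for an HTTP request line."""
    parts = urlsplit(url)
    # "http://www.whatever.com" has an empty path; HTTP needs at least "/"
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.netloc, path


hostport, abspath = split_url("http://www.whatever.com")
print(hostport, abspath)
```

    The two values can then be passed to the connection and request calls in place of the hard-coded ones.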
  8. #5
  9. No Profile Picture
    I hate nerds
    Devshed Novice (500 - 999 posts)

    Join Date
    Jul 2003
    Posts
    540
    Rep Power
    0
    Thanks, guys.

    I'm actually using urllib2 now and it works fine.

    The only thing is that I need to detect a timeout.

    How can I do this without getting down to the socket level?
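
    One possible approach to the timeout question, sketched for Python 3 (urllib.request in place of urllib2): pass a timeout to urlopen and catch the resulting exception. The names fetch_with_timeout and opener are hypothetical; opener exists only so the function can be exercised without a real slow server.

```python
import socket
import urllib.error
import urllib.request


def fetch_with_timeout(url, seconds=5.0, opener=urllib.request.urlopen):
    """Return the body as bytes, or None if the request timed out."""
    try:
        with opener(url, timeout=seconds) as resp:
            return resp.read()
    except socket.timeout:
        # raised directly when the read stalls after the connection is made
        return None
    except urllib.error.URLError as err:
        # a timeout while connecting arrives wrapped in URLError
        if isinstance(err.reason, socket.timeout):
            return None
        raise
```

    Python 2.3 also added socket.setdefaulttimeout(), which sets a process-wide default for new sockets and so affects urllib2 without any socket-level code of your own.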
