#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2005
    Posts
    19
    Rep Power
    0

    CURL: Get final URL after initial URL redirects


    Hi,

    I need to find the redirected or 'final' URL of an initial URL which has a 302 redirect in it.

    User goes to: www.1.com and gets redirected to www.2.com, I need to get 'www.2.com'

    I think this can be done using CURL. Here's what I have so far, which is returning the original link...


    PHP Code:
    /*
     * Get a web file (HTML, XHTML, XML, image, etc.) from a URL.  Return an
     * array containing the HTTP server response header fields and content.
     */
    function get_web_page( $url )
    {
        $options = array(
            CURLOPT_RETURNTRANSFER => true,     // return web page
            CURLOPT_HEADER         => false,    // don't return headers
            CURLOPT_FOLLOWLOCATION => true,     // follow redirects
            CURLOPT_ENCODING       => "",       // handle all encodings
            CURLOPT_USERAGENT      => "spider", // who am i
            CURLOPT_AUTOREFERER    => true,     // set referer on redirect
            CURLOPT_CONNECTTIMEOUT => 120,      // timeout on connect
            CURLOPT_TIMEOUT        => 120,      // timeout on response
            CURLOPT_MAXREDIRS      => 10,       // stop after 10 redirects
        );

        $ch      = curl_init( $url );
        curl_setopt_array( $ch, $options );
        $content = curl_exec( $ch );
        $err     = curl_errno( $ch );
        $errmsg  = curl_error( $ch );
        $header  = curl_getinfo( $ch );
        curl_close( $ch );

        //$header['errno']   = $err;
        //$header['errmsg']  = $errmsg;
        //$header['content'] = $content;
        print($header[0]);
        return $header;
    }


    Anyone know how I can modify this function to get the final redirected URL result?

    Thanks so much for the help. I really appreciate it.
  2. #2
  3. Kage Bunshin
    Devshed Novice (500 - 999 posts)

    Join Date
    Aug 2005
    Location
    The Seven Seas Of Rhye
    Posts
    930
    Rep Power
    423
    I could be wrong, but I believe such information is returned in the header. You might want to turn 'headers' on and 'follow redirects' off, then handle the redirect manually after extracting the new URL from the Location header.
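
A rough sketch of that manual approach, in case it helps: request the page with headers on and redirect-following off, then pull the new URL out of the `Location` header. The function names `extract_location` and `get_redirect_target` are made up for this example, and the 30-second timeout is an arbitrary choice.

```php
<?php
/**
 * Extract the value of the Location header from a raw HTTP header block.
 * Returns null when no Location header is present.
 * (Hypothetical helper, not part of the cURL extension.)
 */
function extract_location(string $rawHeaders): ?string
{
    // Header names are case-insensitive; a URL contains no whitespace,
    // so \S+ stops at the trailing "\r\n".
    if (preg_match('/^Location:[ \t]*(\S+)/mi', $rawHeaders, $m)) {
        return $m[1];
    }
    return null;
}

/**
 * Request $url without following redirects and return the redirect target,
 * or null if the request fails or the response has no Location header.
 */
function get_redirect_target(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,   // return the response as a string
        CURLOPT_HEADER         => true,   // include headers in that string
        CURLOPT_FOLLOWLOCATION => false,  // handle the redirect ourselves
        CURLOPT_TIMEOUT        => 30,     // arbitrary timeout for the sketch
    ));
    $response = curl_exec($ch);
    curl_close($ch);

    return is_string($response) ? extract_location($response) : null;
}
```

Note that this only resolves one hop, so a chain of redirects would need a loop, and a server may send a relative `Location` value that has to be resolved against the original URL before it is requested.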
    "Java makes impossible things possible, but makes easy things difficult." - Somebody
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2008
    Posts
    5
    Rep Power
    0
    I've just come across this thread after a looong fruitless search.

    So, here's how I did it. Took the code from zebe's post, and the suggestions from tagmanadvance. Didn't get the resolved url. However, turning the headers on and leaving the follow redirects on did the trick.

    Code:
    function get_web_page( $url ) 
    { 
        $options = array( 
            CURLOPT_RETURNTRANSFER => true,     // return web page 
            CURLOPT_HEADER         => true,    // return headers 
            CURLOPT_FOLLOWLOCATION => true,     // follow redirects 
            CURLOPT_ENCODING       => "",       // handle all encodings 
            CURLOPT_USERAGENT      => "spider", // who am i 
            CURLOPT_AUTOREFERER    => true,     // set referer on redirect 
            CURLOPT_CONNECTTIMEOUT => 120,      // timeout on connect 
            CURLOPT_TIMEOUT        => 120,      // timeout on response 
            CURLOPT_MAXREDIRS      => 10,       // stop after 10 redirects 
        ); 
    
        $ch      = curl_init( $url ); 
        curl_setopt_array( $ch, $options ); 
        $content = curl_exec( $ch ); 
        $err     = curl_errno( $ch ); 
        $errmsg  = curl_error( $ch ); 
        $header  = curl_getinfo( $ch ); 
        curl_close( $ch ); 
    
        //$header['errno']   = $err; 
       // $header['errmsg']  = $errmsg; 
        //$header['content'] = $content; 
        // curl_getinfo() returns an associative array, so there is no $header[0];
        // the final URL is in $header['url'].
        return $header; 
    }  
    $thisurl = "http://www.example.com/redirectfrom";
    $myUrlInfo = get_web_page( $thisurl ); 
    echo $myUrlInfo["url"];
    // prints "http://www.example.com/redirectto"
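
If only the final URL is needed, a shorter variant is possible. This is a minimal sketch, assuming redirect-following stays on: `curl_getinfo()` accepts `CURLINFO_EFFECTIVE_URL` to return just the last URL cURL requested, and as far as I can tell that value should not depend on `CURLOPT_HEADER`. The function name `get_final_url` and its `$timeout` parameter are made up for this example.

```php
<?php
/**
 * Follow redirects from $url and return the last URL cURL ended up on,
 * or null if the transfer failed.
 * (Hypothetical helper; $timeout is in seconds.)
 */
function get_final_url(string $url, int $timeout = 30): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,      // do not echo the response body
        CURLOPT_FOLLOWLOCATION => true,      // follow redirects
        CURLOPT_MAXREDIRS      => 10,        // stop after 10 redirects
        CURLOPT_CONNECTTIMEOUT => $timeout,  // timeout on connect
        CURLOPT_TIMEOUT        => $timeout,  // timeout on the whole transfer
    ));

    $ok       = curl_exec($ch) !== false;
    $finalUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);

    // On failure the effective URL is not meaningful, so report null instead.
    return $ok ? $finalUrl : null;
}
```

Calling `get_final_url( $thisurl )` with the URL from the post above should give the same value as `$myUrlInfo["url"]`, without building the whole info array.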

    Comments on this post

    • tagmanadvance agrees
  6. #4
  7. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2003
    Location
    Chennai
    Posts
    160
    Rep Power
    12

    Hi
    Is it possible to get the URL after I submit the authentication credentials?
    I am getting the same login page back when I run the curl command.
    Generally, when I log in to the interface from the browser and the credentials are wrong, it returns to the login page with a query string like

    http://ipadaddres/loginpage.hrml?authfail=0
    or
    http://ipadaddres/loginpage.hrml?authfail=1
    or
    http://ipadaddres/loginpage.hrml?authfail=2
    or
    http://ipadaddres/loginpage.hrml?authfail=3

    Is it possible to get this URL so that I can pinpoint the validation error?


    Originally Posted by teesed
    I've just come across this thread after a looong fruitless search.

    So, here's how I did it. Took the code from zebe's post, and the suggestions from tagmanadvance. Didn't get the resolved url. However, turning the headers on and leaving the follow redirects on did the trick.

    With regards
    Chandar.V.Rao
    bangalore
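
One possible way to approach the login case above, sketched under several assumptions: the login form is submitted by POST, the failure page is reached through an HTTP redirect (not JavaScript or a meta refresh), and the form fields are called `username` and `password`. Those field names, and the function names `get_authfail_code` and `get_login_landing_url`, are hypothetical and would need to be matched to the real form.

```php
<?php
/**
 * Pull the authfail code out of a URL's query string.
 * Returns null when the URL has no authfail parameter.
 * (Hypothetical helper for this example.)
 */
function get_authfail_code(string $url): ?string
{
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;                       // no query string at all
    }
    parse_str($query, $params);            // "authfail=2" -> ['authfail' => '2']
    if (isset($params['authfail']) && is_string($params['authfail'])) {
        return $params['authfail'];
    }
    return null;
}

/**
 * POST the login form and return the URL the server finally redirected to,
 * or null if the transfer failed.
 */
function get_login_landing_url(string $loginUrl, string $user, string $pass): ?string
{
    $ch = curl_init($loginUrl);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,    // do not echo the response body
        CURLOPT_FOLLOWLOCATION => true,    // follow the post-login redirect
        CURLOPT_MAXREDIRS      => 10,      // stop after 10 redirects
        CURLOPT_POST           => true,    // submit as a form POST
        CURLOPT_POSTFIELDS     => http_build_query(array(
            'username' => $user,           // hypothetical field name
            'password' => $pass,           // hypothetical field name
        )),
        CURLOPT_COOKIEFILE     => '',      // keep session cookies in memory
        CURLOPT_TIMEOUT        => 30,      // arbitrary timeout for the sketch
    ));
    $ok       = curl_exec($ch) !== false;
    $finalUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);

    return $ok ? $finalUrl : null;
}
```

The landing URL returned by `get_login_landing_url()` can then be passed to `get_authfail_code()` to read the failure code, if the server puts one in the query string.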
