#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2017
    Posts
    7
    Rep Power
    0

    How can I get the destination URL using cURL?


    Hello dear community,


    How can I get the destination URL using cURL? I'm working on a PHP scraper to do the following:

    - cURL several (always fewer than 10) URLs,
    - Add the HTML from each URL to a DOMDocument,
    - Store the hrefs for matching elements in an array.
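
    The three steps above could be sketched roughly as follows. This is only a minimal sketch, not a finished scraper: the helper names fetch_page() and extract_hrefs() are made up for illustration, and the ".pdf" pattern is an assumption about which links count as "matching". CURLOPT_FOLLOWLOCATION makes cURL follow redirects, and CURLINFO_EFFECTIVE_URL then reports the destination URL it ended up at.

    ```php
    <?php
    // Hypothetical helper: fetch one page and report the URL cURL ended up at.
    function fetch_page(string $url): array
    {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
            CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
            CURLOPT_MAXREDIRS      => 5,
            CURLOPT_TIMEOUT        => 15,
        ]);
        $html = curl_exec($ch);
        // The last URL cURL actually requested, i.e. the destination after redirects.
        $finalUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);

        return ['html' => $html === false ? '' : $html, 'final_url' => $finalUrl];
    }

    // Hypothetical helper: collect the href of every <a> whose href matches $pattern.
    function extract_hrefs(string $html, string $pattern): array
    {
        if ($html === '') {
            return [];
        }
        $doc = new DOMDocument();
        $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
        $doc->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $hrefs = [];
        foreach ($doc->getElementsByTagName('a') as $a) {
            $href = $a->getAttribute('href');
            if ($href !== '' && preg_match($pattern, $href)) {
                $hrefs[] = $href;
            }
        }
        return $hrefs;
    }

    // Parsing demo on an inline HTML string, so no network access is needed.
    $sample = '<html><body><a href="/docs/a.pdf">A</a>'
            . '<a href="/about.html">About</a><a href="b.PDF">B</a></body></html>';
    $pdfLinks = extract_hrefs($sample, '/\.pdf$/i');
    print_r($pdfLinks);
    ```

    For the real pages, one would call fetch_page() for each entry of the URL list and pass the returned 'html' value to extract_hrefs().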


    I expect to fetch a lot of URLs; afterwards I have to iterate through the results to find the matching link elements.

    I've tried my parser code on a single cURL and it works
    (returns an array with the URLs for each pdf on that page).


    First of all, I have to develop the cURL code:

    Code:
    // Pages to fetch
    $urls = array(
        'http://www.example1.com/foo_bar/1.htm',
        'http://www.example2.com/foo_bar/2.htm',
        'http://www.example3.com/foo_bar/3.htm',
        'http://www.example4.com/foo_bar/4.htm',
        'http://www.example5.com/foo_bar/1.htm',
        'http://www.example6.com/foo_bar/2.htm',
        'http://www.example7.com/foo_bar/3.htm',
        'http://www.example8.com/foo_bar/4.htm',
    );
    Now I have to find a regular expression that filters for foo_bar, in other
    words, one that helps to find each URL that contains foo_bar.


    I need to store the results.
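
    One possible way to do the regular-expression filtering is preg_grep(), which returns the array entries that match a pattern. The sketch below is only an illustration: the second URL (with "other" in its path) is invented here so that there is something to filter out, and the pattern assumes foo_bar appears as a whole path segment.

    ```php
    <?php
    $urls = [
        'http://www.example1.com/foo_bar/1.htm',
        'http://www.example2.com/other/2.htm', // invented non-matching entry
        'http://www.example3.com/foo_bar/3.htm',
    ];

    // preg_grep() keeps the entries that match and preserves their original keys.
    $matches = preg_grep('~/foo_bar/~', $urls);

    // Re-index so the stored results are numbered from 0.
    $matches = array_values($matches);
    print_r($matches);
    ```

    Using "~" as the pattern delimiter avoids having to escape the slashes in the path.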
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jul 2003
    Posts
    4,362
    Rep Power
    630
    I believe you want to use array_keys(). It will produce an array of the keys of all the URLs containing the desired string. Use that array to extract the actual URLs from the original array.
    There are 10 kinds of people in the world. Those that understand binary and those that don't.
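
    A note on the key-based approach from the reply above: when array_keys() is given a search value, it only returns the keys of entries equal to that value, so it does not match substrings by itself. A minimal sketch of one way to get the keys of URLs that contain a string is shown below; the sample URLs and the variable names are assumptions made for illustration.

    ```php
    <?php
    $urls = [
        'http://www.example1.com/foo_bar/1.htm',
        'http://www.example2.com/other/2.htm', // invented non-matching entry
        'http://www.example3.com/foo_bar/3.htm',
    ];
    $needle = 'foo_bar';

    // Searching for the needle directly finds nothing: no URL equals 'foo_bar'.
    $exact = array_keys($urls, $needle);

    // Keys of the URLs that contain the needle somewhere in the string.
    $keys = array_keys(array_filter($urls, function ($url) use ($needle) {
        return strpos($url, $needle) !== false;
    }));

    // Use the keys to extract the actual URLs from the original array.
    $selected = array_intersect_key($urls, array_flip($keys));
    print_r($selected);
    ```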
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2017
    Posts
    7
    Rep Power
    0
    Hello dear gw1500se,

    many thanks for the reply. The thing is: I have no ready-made set of URLs. I need to fetch all the matching ones on the net,
    so I need to run a request to Google.

    Or: how would you approach the "fetching" part?

    I'd love to hear from you.
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jul 2003
    Posts
    4,362
    Rep Power
    630
    That is different from your OP. Are you asking how to use DOM to parse the result of the Google search? Your OP implied you were already successful doing that.
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2017
    Posts
    7
    Rep Power
    0
    Good evening dear gw1500se,

    many thanks for your quick reply.

    You're right: I am still facing the "fetching" part.


    First: fetching,
    then: parsing,
    finally: writing the results to a file or
    storing the data in a DB.

    Well, the fetching is the first task that I have to manage.
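
    For the last step of the plan above, writing the results to a file, a minimal sketch could look like this. The helper name save_links(), the one-URL-per-line format and the temporary file are all assumptions for illustration; a database would need its own code.

    ```php
    <?php
    // Hypothetical helper: write one URL per line and return the number of links written.
    function save_links(array $links, string $path): int
    {
        $written = file_put_contents($path, implode(PHP_EOL, $links) . PHP_EOL);
        if ($written === false) {
            throw new RuntimeException("Could not write to $path");
        }
        return count($links);
    }

    $links = [
        'http://www.example1.com/foo_bar/1.htm',
        'http://www.example3.com/foo_bar/3.htm',
    ];
    $path  = tempnam(sys_get_temp_dir(), 'links'); // temporary file, for the demo only
    $count = save_links($links, $path);

    // Read the file back, one array entry per line.
    $stored = file($path, FILE_IGNORE_NEW_LINES);
    print_r($stored);
    ```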
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jul 2003
    Posts
    4,362
    Rep Power
    630
    OK, so you need to use the DOM parser. Try this article.
