#16
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2016
    Location
    Cheshire, UK
    Posts
    90
    Rep Power
    72
    This is the code you have
    PHP Code:
    $html = str_get_html($response_string);                   // $html is now a simple_html_dom object
    .
    .
    .
    $dom = new DOMDocument;
    if ($dom->loadHTML($html, LIBXML_NOWARNING))              // $html is still an object when it should be HTML text.

    Why are you using simple_html_dom and then feeding it to DOM? Use one or the other.

    You really need to read your code and develop an awareness of what is in the variables.
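A minimal sketch of the "use one or the other" advice, using DOMDocument alone. The sample markup is a hypothetical stand-in for the fetched page (`$response_string` in the thread), and `libxml_use_internal_errors()` is used here to keep parse warnings quiet; that is one option, not necessarily what the original script should do.

```php
<?php
// Hypothetical stand-in for the fetched page; in the thread this would be $response_string.
$response_string = '<html><head><title>Demo</title></head>'
    . '<body><a href="/one">One</a><a href="/two">Two</a></body></html>';

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$links = [];

// loadHTML() expects a string of HTML text, not a simple_html_dom object.
if ($dom->loadHTML($response_string)) {
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $links[] = $anchor->getAttribute('href');
    }
}
libxml_clear_errors();

print_r($links);
```

The point is only that the same string goes straight into `loadHTML()`, so no second parser is involved.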
  2. #17
  3. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2017
    Posts
    845
    Rep Power
    0
    Folks,

    On post number 7, I asked:
    And how can I raise this limit:
    Maximum execution time of 30 seconds exceeded
    to a time I want, such as 180 seconds (3 minutes)?


    I need an answer from you on that once you have checked out the script in that post. Note that that script and the script in my previous post are not the same.
    I need an answer to my previous post too.

    Cheers for your TIMEs.
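For the time-limit question above, a minimal sketch assuming the limit only needs to change for this one script. The value 180 comes from the question; whether the host allows the limit to be changed at runtime is an assumption.

```php
<?php
// Raise the limit for this script only; 180 seconds is the value asked about above.
// set_time_limit() also restarts the timeout counter from zero when it is called.
set_time_limit(180);

// The same setting expressed through ini_set(); either call alone is enough.
ini_set('max_execution_time', '180');

// Show the limit that is now in effect.
echo ini_get('max_execution_time'), "\n";
```

The `max_execution_time` directive in php.ini changes the default for every script instead, which may be more than is wanted here.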
  4. #18
  5. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2017
    Posts
    845
    Rep Power
    0
    Originally Posted by Barand
    This is the code you have
    PHP Code:
    $html = str_get_html($response_string);                   // $html is now a simple_html_dom object
    .
    .
    .
    $dom = new DOMDocument;
    if ($dom->loadHTML($html, LIBXML_NOWARNING))              // $html is still an object when it should be HTML text.

    Why are you using simple_html_dom and then feeding it to DOM? Use one or the other.

    You really need to read your code and develop an awareness of what is in the variables.
    Because, I grabbed a code from this tutorial while learning a thing or 2 about the dom thingy:
    Top 10 Best Usage Examples of PHP Simple HTML DOM Parser
    That is where the simple_html_dom.php came from.

    And, I got help from here:
    https://stackoverflow.com/questions/...by-php-or-curl
    That is where the DOM came from. I thought the DOM and the simple_html_dom were the same thing. Thanks for bringing this mistake of mine to my attention.

    Notice the 2 codes on these 2 links. And then notice my 2 scripts.
    I grabbed those 2 scripts from those 2 links and then nested my own foreach loops. Copy & paste. But I do understand the code. CONFESSION: Understand most of it.
    If you have any doubts then ask me 3 questions about them and I'll answer. That way, I won't have you complaining like others that I cut & paste without understanding anything.
    I do cut & paste at first. That way, I get a little lead by working on others' skeletons. Then, I bother the forum(s) for an explanation of a line or few lines and learn what I did not know. Get work experience that way.
    If you don't mind, Barand, would you fix both these scripts, add comments, and post them in this thread for present & future newbies to learn from?
    Your sample would be very useful and a great progression for us newbies. Would be most appreciated. It would speed up our learning and work/practical experience.

    Thanks!

    EDIT: Barand, since I used simple_html_dom.php and the DOM in both scripts, how about splitting things up so that one is used in one script and the other in the other?
    That way, we newbies learn from you about how to use both the simple_html_dom.php and the DOM correctly. I think this is a good idea! What do you think ? Kill 2 birds with a single sling shot! That'll be the day!
    Last edited by UniqueIdeaMan; May 21st, 2018 at 04:31 PM.
  6. #19
  7. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2017
    Posts
    845
    Rep Power
    0



    Googling now for tutorial with KWs:
    a simple web crawler php tutorial

    In the past, I downloaded from youtube 13 vids on building your own web crawler with php and 17 vids on building your own searchengine.
    Not in the mood to be watching all these vids, right now. And so, googling for text tutorials.

    EDIT: Found this on Google and am going through it. Have a feeling it was written by a non-native English speaker. Grammar is appalling!
    Create Simple Web Crawler Using PHP And MySQL
    Just one hindrance after another!
    Last edited by UniqueIdeaMan; May 21st, 2018 at 04:50 PM.
  8. #20
  9. Contributing User
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2006
    Posts
    2,682
    Rep Power
    1841
    Not linked to your other problems, which I cannot help with as a) I don't do php and b) I really don't do OOP ... but please, for the sake of my sanity if nothing else, consider changing:
    Code:
    if($current_link_crawling_level == $link_crawling_level_max)
    to
    Code:
    if($current_link_crawling_level >= $link_crawling_level_max)
    I get very twitchy when range/loop limits are tested for equality. Not saying it'd happen but what if that variable got 'bumped' one higher and managed to exceed the limit value. Off your script would go following links all across the internet until ....?
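A small sketch of why the `>=` test is the safer one. The variable name follows the thread; the `levels_visited()` helper and the step of 2 are contrived assumptions, chosen so the counter jumps straight over the limit.

```php
<?php
// Name follows the thread's variable; the value 5 is an arbitrary example.
$link_crawling_level_max = 5;

/**
 * Hypothetical helper: counts how many levels get crawled before the
 * limit check stops the loop.
 */
function levels_visited(int $max, int $step, bool $strict_equality): int
{
    $visited = 0;
    $hard_stop = 100; // safety net so the demo always terminates
    for ($level = 0; $visited < $hard_stop; $level += $step) {
        $limit_reached = $strict_equality ? ($level == $max) : ($level >= $max);
        if ($limit_reached) {
            break;
        }
        $visited++;
    }
    return $visited;
}

$with_equality = levels_visited($link_crawling_level_max, 2, true);
$with_greater_equal = levels_visited($link_crawling_level_max, 2, false);

echo "==  : $with_equality levels\n";
echo ">=  : $with_greater_equal levels\n";
```

With the equality test the level goes 0, 2, 4, 6 and never equals 5, so only the safety net stops it; with `>=` the loop ends as soon as the level passes the limit.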

    Comments on this post

    • UniqueIdeaMan agrees : Agree!
    The moon on the one hand, the dawn on the other:
    The moon is my sister, the dawn is my brother.
    The moon on my left and the dawn on my right.
    My brother, good morning: my sister, good night.
    -- Hilaire Belloc
  10. #21
  11. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2017
    Posts
    845
    Rep Power
    0
    Originally Posted by SimonJM
    Not linked to your other problems, which I cannot help with as a) I don't do php and b) I really don't do OOP ... but please, for the sake of my sanity if nothing else, consider changing:
    Code:
    if($current_link_crawling_level == $link_crawling_level_max)
    to
    Code:
    if($current_link_crawling_level >= $link_crawling_level_max)
    I get very twitchy when range/loop limits are tested for equality. Not saying it'd happen but what if that variable got 'bumped' one higher and managed to exceed the limit value. Off your script would go following links all across the internet until ....?
    Yes, I know what you mean. I had different results when testing with "==" and ">=" when I used to build .exe bots with Ubot Studio.
    So, I was gonna experiment before actually finalising my code. But still, thanks for confirming what it should be. You saved my time from experimenting again and re-learning what I forgot.
  12. #22
  13. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2017
    Posts
    845
    Rep Power
    0
    SimonJM,

    Do you know the answer to this:
    http://forums.devshed.com/php-develo...ml#post2985597

    Anyone else know how to fix this issue ?
  14. #23
  15. Contributing User
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2006
    Posts
    2,682
    Rep Power
    1841
    Originally Posted by UniqueIdeaMan
    SimonJM,

    Do you know the answer to this:
    http://forums.devshed.com/php-develo...ml#post2985597

    Anyone else know how to fix this issue ?
    One of the benefits of being 'an old person' is that I never really had to learn php, so as much as I'd like to help I would not know how to. If I were doing it, I'd look at the hows and whys of spawning sub-processes and the ability to 'politely' kill them off from a master (or sibling?) process if still executing after a fixed amount of time. That, presumably, then requires the ability to accurately track, and identify, such processes. Can php spawn child processes and if so how can the parent process identify that child?
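On the question of child processes: PHP can start them with `proc_open()`, and `proc_get_status()` reports the child's `pid` and whether it is still running. Below is a rough sketch of the "kill it after a fixed time" idea. `run_with_timeout()` is a hypothetical helper, the array form of the command assumes PHP 7.4 or later, and `PHP_BINARY` assumes the script is run from the command line.

```php
<?php
/**
 * Hypothetical helper: runs a command as a child process and terminates it
 * if it is still running after $timeout_seconds.
 * Returns "finished" or "killed".
 */
function run_with_timeout(array $command, float $timeout_seconds): string
{
    // The array form of $command (PHP 7.4+) starts the program without a shell.
    $descriptors = [
        0 => ['pipe', 'r'],
        1 => ['pipe', 'w'],
        2 => ['pipe', 'w'],
    ];
    $process = proc_open($command, $descriptors, $pipes);
    if (!is_resource($process)) {
        throw new RuntimeException('Could not start child process');
    }

    $deadline = microtime(true) + $timeout_seconds;
    $result = 'finished';

    while (true) {
        $status = proc_get_status($process); // includes 'pid' and 'running'
        if (!$status['running']) {
            break;
        }
        if (microtime(true) >= $deadline) {
            proc_terminate($process); // asks the child to stop
            $result = 'killed';
            break;
        }
        usleep(50000); // poll every 50 ms
    }

    // This sketch ignores the child's output; it only closes the pipes.
    foreach ($pipes as $pipe) {
        fclose($pipe);
    }
    proc_close($process);

    return $result;
}

// A child that sleeps too long, and one that exits straight away.
$slow = run_with_timeout([PHP_BINARY, '-r', 'sleep(10);'], 0.5);
$fast = run_with_timeout([PHP_BINARY, '-r', 'echo "done";'], 5.0);

echo "slow child: $slow\n";
echo "fast child: $fast\n";
```

Polling keeps the parent in control; a child that writes a lot of output would need its pipes read as well, which this sketch leaves out.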
  16. #24
  17. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2017
    Posts
    845
    Rep Power
    0
    Php Folks,

    Just by glancing over this script, can you spot any errors? Let us see how good you are at spotting errors.
    PHP Code:
    $main_url="https://developers.google.com/youtube/";
    $str = file_get_contents($main_url);

    // Gets Webpage Title
    if(strlen($str)>0)
    {
     $str = trim(preg_replace('/\s+/', ' ', $str)); // supports line breaks inside <title>
     preg_match("/\<title\>(.*)\<\/title\>/i",$str,$title); // ignore case
     $title=$title[1];
    }

    // Gets Webpage Description
    $b =$main_url;
    @$url = parse_url( $b );
    @$tags = get_meta_tags($url['scheme'].'://'.$url['host'] );
    $description=$tags['description'];

    // Gets Webpage Internal Links
    $doc = new DOMDocument;
    @$doc->loadHTML($str);

    $items = $doc->getElementsByTagName('a');
    foreach($items as $value)
    {
     $attrs = $value->attributes;
     $sec_url[]=$attrs->getNamedItem('href')->nodeValue;
    }
    $all_links=implode(",",$sec_url);

    // Store Data In Database
    $host="localhost";
    $username="root";
    $password="";
    $databasename="crawler_index";
    $conn=mysqli_connect($host,$username,$password);
    $db=mysqli_select_db($conn,"$databasename");

    mysql_query("insert into webpage_details values('$main_url','$title','$description','$all_links')");

    ?>
    Without testing the script on WAMP/LAMP/XAMPP, which lines do you reckon are in error?
    Last edited by UniqueIdeaMan; May 27th, 2018 at 02:51 PM.
  18. #25
  19. Contributing User
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2006
    Posts
    2,682
    Rep Power
    1841
    Not an error, but your indentation sucks.
    Not an error, but using root to connect, and having that without a password?
    Don't you actually need to tell curl to do something?
  20. #26
  21. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2017
    Posts
    845
    Rep Power
    0
    Originally Posted by SimonJM
    Not an error, but your indentation sucks.
    Not an error, but using root to connect, and having that without a password?
    Don't you actually need to tell curl to do something?
    Not my script. Thanks for bringing the indentation to my attention.
    Are you sure you see no errors ? Because, I see this as an error and get the error:
    PHP Code:
    $title=$title[1]; 
    Code came from tutorial here:
    Create Simple Web Crawler Using PHP And MySQL
  22. #27
  23. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2017
    Posts
    845
    Rep Power
    0
    Originally Posted by SimonJM
    One of the benefits of being 'an old person' is that I never really had to learn php, so as much as I'd like to help I would not know how to. If I were doing it, I'd look at the hows and whys of spawning sub-processes and the ability to 'politely' kill them off from a master (or sibling?) process if still executing after a fixed amount of time. That, presumably, then requires the ability to accurately track, and identify, such processes. Can php spawn child processes and if so how can the parent process identify that child?
    Mmm. Thanks. I have not gotten into spawning kids yet on php.
  24. #28
  25. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jan 2017
    Posts
    845
    Rep Power
    0
    Php Folks,

    Can anyone fix the code in post 24, which I found in the tutorial mentioned in post 26? I get an error about the unorthodox array mentioned in post 26.
    http://forums.devshed.com/php-develo...ml#post2985723
  26. #29
  27. No Profile Picture
    Contributing User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jul 2003
    Posts
    4,470
    Rep Power
    653
    Originally Posted by UniqueIdeaMan
    Php Folks,

    Can anyone fix this code at post 24 that I found on the tutorial mentioned in my above post at 26 ?
    I'm sure someone on the Hire a Programmer forum can.
    There are 10 kinds of people in the world. Those that understand binary and those that don't.
  28. #30
  29. Contributing User
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2006
    Posts
    2,682
    Rep Power
    1841
    Originally Posted by UniqueIdeaMan
    Not my script. Thanks for bringing the indentation to my attention.
    Are you sure you see no errors ? Because, I see this as an error and get the error:
    PHP Code:
    $title=$title[1]; 
    Code came from tutorial here:
    Create Simple Web Crawler Using PHP And MySQL
    As I have said, I don't do php, and have no way of running any php code. I did wonder about using $title as both a string variable and an array, but as you have used this code (or something very similar to it) I 'overlooked' it; and I don't even know if that IS the issue you get, as you didn't bother saying what error you get!
    You are also going to get the same 'undefined index' error with 'description' if the tags do not contain a description element. Can you (should you?) preload the target array with dummy (blank/null) values for any element that you need to be there?
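One way to act on both points, sketched under assumptions: the sample markup and the `$tags` array are hypothetical stand-ins for `file_get_contents($main_url)` and `get_meta_tags()`, and `$matches` is a name introduced here so that `$title` is not used as both array and string.

```php
<?php
// Hypothetical page content standing in for file_get_contents($main_url).
$str = '<html><head><title>Sample  Page</title></head><body></body></html>';

// preg_match() returns 1 only when the pattern matched, so check it before
// reading $matches[1]; a separate variable avoids reusing $title.
$title = '';
if (preg_match('/<title>(.*?)<\/title>/is', $str, $matches) === 1) {
    $title = trim(preg_replace('/\s+/', ' ', $matches[1]));
}

// Stand-in for the array returned by get_meta_tags(); here it has no description.
$tags = ['keywords' => 'php, crawler'];

// The null coalescing operator (PHP 7+) supplies a default for a missing key,
// which has the same effect as preloading the array with a blank value.
$description = $tags['description'] ?? '';

var_dump($title, $description);
```

With defaults in place, a page without a title or description yields empty strings instead of an 'undefined index' notice.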
