#1
    Contributing User
    Devshed Newbie (0 - 499 posts)
    Join Date: Jan 2000
    Posts: 79
    Rep Power: 15
    I have written a crawler in PHP which scans pages for certain file links.

    The problem is that when I run it, it uses 99% of the server's CPU, which pisses off the people hosting me, as you can imagine. Does anyone know what the problem could be, or how I could limit the CPU it uses?

    I'm not going to paste my code, as I don't think the problem is in the code: a friend wrote a script which does the same kind of thing with completely different code, and the same problem still occurs.

    Any help would be greatly appreciated.
    Thanks in advance,

    Basil.
#2
    Contributing User
    Devshed Newbie (0 - 499 posts)
    Join Date: Aug 2000
    Location: Washington, USA
    Posts: 52
    Rep Power: 15
    Is there logic built in to kill the socket after a set period of time? Too many simultaneous open, pending connections can burn CPU big time.
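
    A minimal sketch of that idea in PHP, assuming the crawler fetches pages through the HTTP stream wrapper: the `timeout` context option caps how long a request may stall before PHP gives up on it. The helper names `make_fetch_context` and `fetch_page`, the 5-second default, and the redirect limit are illustrative assumptions, not taken from the original crawler.

    ```php
    <?php
    // Hypothetical helper: build a stream context whose HTTP requests
    // give up after $timeoutSeconds instead of hanging indefinitely.
    function make_fetch_context(float $timeoutSeconds = 5.0)
    {
        return stream_context_create([
            'http' => [
                'timeout'       => $timeoutSeconds, // seconds, as a float
                'max_redirects' => 3,               // assumed limit, avoids long redirect chains
            ],
        ]);
    }

    // Hypothetical helper: fetch one page, returning null on failure or timeout.
    function fetch_page(string $url, float $timeoutSeconds = 5.0): ?string
    {
        // @ suppresses the warning; failure is reported through the return value.
        $body = @file_get_contents($url, false, make_fetch_context($timeoutSeconds));
        return $body === false ? null : $body;
    }
    ```

    A crawler built this way would skip any URL for which `fetch_page` returns null rather than keep the connection open.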


#3
    Contributing User
    Devshed Newbie (0 - 499 posts)
    Join Date: Aug 2000
    Location: London/UK
    Posts: 91
    Rep Power: 15
    Also, you could run it via a wrapper that executes the script at a low priority (assuming you've got PHP as a CGI),

    e.g. nice -n 10 (a positive value lowers the priority)
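
    If a wrapper is not an option, a similar effect can be approximated from inside the script itself. This is a hedged sketch, assuming a Unix-like host where PHP's `proc_nice()` is available; the function name `lower_priority` and the increment of 10 are illustrative.

    ```php
    <?php
    // Hypothetical helper: ask the OS to schedule this PHP process at a lower priority.
    function lower_priority(int $increment = 10): bool
    {
        if (!function_exists('proc_nice')) {
            return false; // proc_nice() is not available on this build or platform
        }
        // A positive increment raises the niceness, i.e. lowers the priority.
        return @proc_nice($increment);
    }

    lower_priority();
    // ... the crawler's work would follow here ...
    ```

    Note that a low priority does not cap CPU use on an otherwise idle machine; it only lets other processes go first when they need the CPU.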



#4
    freebsd
    Guest
    Devshed Newbie (0 - 499 posts)
    >>Does anyone know what the problem could be, or how I could limit the CPU it uses?

    Start here -> http://www.devshed.com/Talk/Forums/F...ML/000360.html

    If your PHP crawler script is taking too long to finish, try writing it in Perl with LWP, which should be _significantly_ faster than doing the same task in PHP. PHP was never meant for advanced web crawling tasks.
#5
    Contributing User
    Devshed Newbie (0 - 499 posts)
    Join Date: Jan 2000
    Posts: 79
    Rep Power: 15
    I don't know any Perl and don't really have the time to learn it.

    I put sleep(1) in the main loop of the crawler and CPU usage dropped back to around 20%, with the occasional rise. I can live with this.
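
    A minimal sketch of that fix, assuming the crawler's main loop walks a list of URLs. The name `crawl_with_pause`, the `$fetch` callback, and the one-second default are placeholders, not the actual crawler code.

    ```php
    <?php
    // Hypothetical main loop: process each URL, then yield the CPU for a while.
    // $fetch is any callable that takes a URL and returns a result for it.
    function crawl_with_pause(array $urls, callable $fetch, int $pauseMicros = 1000000): array
    {
        $results = [];
        foreach ($urls as $url) {
            $results[$url] = $fetch($url);
            // Sleep between iterations so the process is not busy the whole time.
            // 1,000,000 microseconds matches the sleep(1) described above.
            usleep($pauseMicros);
        }
        return $results;
    }
    ```

    The pause trades total run time for lower average CPU load, so a shorter delay may be enough if the crawl needs to finish sooner.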
