August 22nd, 2000, 11:14 AM
I have written a crawler in PHP which scans pages for certain file links.
The problem is that when I run it, it uses 99% of the server's CPU, which pisses off the people hosting me, as you can imagine. Does anyone know what the problem could be, or how I could limit the CPU it uses?
I'm not going to paste my code, as I don't think the problem is in the code itself: a friend wrote a script that does the same kind of thing with completely different code, and the same problem occurs.
Any help would be very greatly appreciated.
Thanks in advance
August 22nd, 2000, 03:02 PM
Is there logic built-in to kill the socket after a set period of time? Too many simultaneous open pending connections can burn CPU big time.
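If not, here is a minimal sketch of what that could look like in PHP (the function name, the host/path parameters, and the 10-second limits are illustrative, not taken from the original script):

    <?php
    // Hypothetical fetch helper: give up on slow hosts instead of
    // leaving connections pending and burning CPU.
    function fetch_page($host, $path) {
        // Fifth argument: 10-second connect timeout.
        $fp = fsockopen($host, 80, $errno, $errstr, 10);
        if (!$fp) {
            return false; // could not connect in time
        }
        // Also cap how long each read may block.
        socket_set_timeout($fp, 10);
        fputs($fp, "GET $path HTTP/1.0\r\nHost: $host\r\n\r\n");
        $page = "";
        while (!feof($fp)) {
            $page .= fgets($fp, 4096);
            $status = socket_get_status($fp);
            if ($status["timed_out"]) {
                break; // read stalled; kill the socket
            }
        }
        fclose($fp);
        return $page;
    }
    ?>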
August 22nd, 2000, 07:32 PM
Also, you could run it via a wrapper that executes the script at a low priority (assuming you've got PHP as a CGI), e.g. with nice -n 10, as in the sketch below. (Note that a positive niceness lowers the priority; nice --10 would set the niceness to -10, which raises priority and requires root.)
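A minimal wrapper, assuming the crawler lives at /path/to/crawler.php (a made-up path) and the PHP CGI binary is on the PATH:

    #!/bin/sh
    # Run the crawler with niceness 10 so other processes get the CPU first.
    exec nice -n 10 php /path/to/crawler.php

Then invoke the wrapper from cron or the shell instead of running the script directly.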
August 23rd, 2000, 10:35 AM
>>Does anyone know what the problem could be or how i could limit the CPU it uses??
Start here -> http://www.devshed.com/Talk/Forums/F...ML/000360.html
If your PHP crawler script is taking too long to run, try writing it in Perl using LWP, which should be _significantly_ faster at this kind of task than PHP. PHP was never meant for advanced web-crawling work.
August 23rd, 2000, 11:19 AM
I don't know any Perl and don't really have the time to learn it...
I put sleep(1) in the main loop of the crawler and CPU usage dropped back to around 20% with the occasional spike... I can live with that.