
June 14th, 2001, 10:47 AM
That sounds efficient on the surface, Pressly, but it's not. What if no machine sits at that IP? What if, as with my home computer, there's a firewall that won't respond to an unapproved request? Then your process hangs there waiting for a response until it times out.
Also, many firewall systems will treat that as a hack attempt.
Not to mention that the majority of IPs do NOT have an HTTP server.
All this leads up to your server wasting a lot of resources doing nothing.
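To make the timeout problem concrete, here is a minimal sketch of what probing a single address looks like. The function name `probe_http` and the 2-second default timeout are my own assumptions for illustration, not anything from Pressly's proposal.

```python
import socket


def probe_http(ip, port=80, timeout=2.0):
    """Return True if something accepts a TCP connection at ip:port.

    Hypothetical helper: the name and the default timeout are assumptions.
    """
    try:
        # Without an explicit timeout, a silent firewall can leave this call
        # blocked for as long as the OS default allows.
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts alike.
        return False
```

Every silent address costs the full timeout. As a rough upper bound, a serial pass over all 2^32 IPv4 addresses at 2 seconds each would take on the order of centuries if most of them never answer, and an open port 80 still doesn't guarantee there's a website worth indexing behind it.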
KC,
Besides following a site's internal links, spiders also harvest external links and follow those too. Once it gets started, a spider could theoretically crawl forever, following links from site to site.
Of course, you'll have to give it a list of URLs to start with, so try to pick link-rich sites.
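Here's a rough sketch of that seed-and-follow idea, in case it helps. Everything named here (`LinkParser`, `extract_links`, `crawl`, the `fetch` callback, and the `max_pages` cap) is made up for illustration; a real spider would also want politeness delays, robots.txt handling, and error handling.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return absolute http(s) links found in html, with fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute = urldefrag(urljoin(base_url, href)).url
        if absolute.startswith(("http://", "https://")):
            links.append(absolute)
    return links


def crawl(seeds, fetch, max_pages=100):
    """Breadth-first crawl starting from the seed URLs.

    fetch(url) is an assumed callback returning the page HTML as a string,
    or None if the page could not be retrieved.
    """
    queue = deque(seeds)
    seen = set(seeds)      # every URL ever queued, so nothing is queued twice
    visited = []           # pages actually fetched, in crawl order
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        for link in extract_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

The `max_pages` cap is there because, as noted, the queue may never empty on its own. For `fetch` you could wrap something like `urllib.request.urlopen` with a timeout; passing it in as a callback also lets you test the crawl logic against a plain dict of canned pages.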