Search Engine Optimization
 
Dev Shed Forums > Web Design > Search Engine Optimization

#1 redstar, October 10th, 2004, 09:06 AM
Robots.txt not found

File does not exist: /home/var/www/html/robots.txt

What is this?

#2 PacketManiac, October 10th, 2004, 09:18 AM
What does this have to do with search engine optimization? And unless you don't understand English, I think it is pretty clear: the file /home/var/www/html/robots.txt doesn't exist. Check whether it exists.

#3 tony84, October 10th, 2004, 09:35 AM
Change /home/var/www/html/ to http://www.YOURDOMAIN.COM, then add /robots.txt

So it should look like http://www.yourdomain.com/robots.txt

#4 redstar, October 11th, 2004, 03:23 PM
But why does MSNBot look for this file? I think I must write something in it to get my site ranked.

#5 tony84, October 11th, 2004, 03:43 PM
Are you sure you've made it correctly?

#6 redstar, October 12th, 2004, 02:41 PM
I didn't do anything. I just see this in my error.log.

#7 Aronya, October 13th, 2004, 03:07 PM
robots.txt is a text file that most spiders check for. It specifies which pages of your site you do not want them to crawl or index. Do a Google search to find out how to use it.

#8 Koos, November 6th, 2004, 02:37 AM
robots.txt

Don't worry, I also wondered what the hell it was, and my first idea was to add some keywords to the file. It actually has to do with keeping robots out of your site.
On www.robotstxt.org they say the following:
"Why do I find entries for /robots.txt in my log files?
They are probably from robots trying to see if you have specified any rules for them using the Standard for Robot Exclusion, see also below.

If you don't care about robots and want to prevent the messages in your error logs, simply create an empty file called robots.txt in the root level of your server.
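(Not part of the FAQ text: a minimal Python sketch of the "create an empty file" step. The function name `ensure_empty_robots_txt` is made up for illustration, and it assumes you know your server's document root and have write permission there.)

```python
from pathlib import Path


def ensure_empty_robots_txt(document_root: str) -> Path:
    """Create an empty robots.txt in document_root if none exists yet.

    Returns the path to the file. An existing robots.txt keeps its contents.
    """
    robots_path = Path(document_root) / "robots.txt"
    # touch() creates a zero-byte file; if the file already exists it only
    # updates the modification time and leaves the contents alone.
    robots_path.touch(exist_ok=True)
    return robots_path
```

With the path from the error message at the top of this thread, the call would be `ensure_empty_robots_txt("/home/var/www/html")`. Whether that directory really is the document root depends on the server configuration.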

Don't put any HTML or English language "Who the hell are you?" text in it -- it will probably never get read by anyone :-)

How do I prevent robots scanning my site?
The quick way to prevent robots visiting your site is put these two lines into the /robots.txt file on your server:

User-agent: *
Disallow: /

but it's easy to be more selective than that.

Where do I find out how /robots.txt files work?
You can read the whole standard specification but the basic concept is simple: by writing a structured text file you can indicate to robots that certain parts of your server are off-limits to some or all robots. It is best explained with an example:

# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism

User-agent: webcrawler
Disallow:

User-agent: lycra
Disallow: /

User-agent: *
Disallow: /tmp
Disallow: /logs

The first two lines, starting with '#', specify a comment.

The first paragraph specifies that the robot called 'webcrawler' has nothing disallowed: it may go anywhere.

The second paragraph indicates that the robot called 'lycra' has all relative URLs starting with '/' disallowed. Because all relative URLs on a server start with '/', this means the entire site is closed off.

The third paragraph indicates that all other robots should not visit URLs starting with /tmp or /logs. Note the '*' is a special token, meaning "any other User-agent"; you cannot use wildcard patterns or regular expressions in either User-agent or Disallow lines.

Two common errors:

* Wildcards are _not_ supported: instead of 'Disallow: /tmp/*' just say 'Disallow: /tmp/'.
* You shouldn't put more than one path on a Disallow line (this may change in a future version of the spec)."
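Not part of the FAQ: a minimal Python sketch that runs the example file through the standard library's `urllib.robotparser` and asks what each robot may fetch. The agent name `examplebot`, the helper `can_visit`, and the test paths are made up for illustration. Python's parser is only one implementation, so real crawlers may treat edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The example robots.txt quoted from the FAQ above.
ROBOTS_TXT = """\
# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism

User-agent: webcrawler
Disallow:

User-agent: lycra
Disallow: /

User-agent: *
Disallow: /tmp
Disallow: /logs
"""

# Parse the text directly instead of downloading it from a server.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def can_visit(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # "examplebot" is a hypothetical robot that falls under 'User-agent: *'.
    for agent in ("webcrawler", "lycra", "examplebot"):
        for path in ("/index.html", "/tmp/scratch.html", "/logs/access.log"):
            verdict = "allowed" if can_visit(agent, path) else "disallowed"
            print(f"{agent:11} {path:20} {verdict}")
```

If the FAQ's reading holds, `webcrawler` is allowed everywhere, `lycra` is disallowed everywhere, and any other robot is kept out of paths starting with /tmp or /logs only.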

© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 3 hosted by Hostway