#1
Robots.txt not found
File does not exist: /home/var/www/html/robots.txt
What is this?
#2
What does this have to do with search engine optimization? And unless you don't understand English, I think it is pretty clear: the file /home/var/www/html/robots.txt doesn't exist. Check whether it is actually there.
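If you would rather script that check, here is a minimal sketch in Python. The function name robots_txt_exists is made up for illustration, and it assumes /home/var/www/html (the directory from the error message) really is the document root your web server serves from.

```python
from pathlib import Path


def robots_txt_exists(docroot: str) -> bool:
    """Return True if a regular file named robots.txt sits directly in docroot."""
    return (Path(docroot) / "robots.txt").is_file()


if __name__ == "__main__":
    # Directory taken from the error message in post #1; adjust for your server.
    print(robots_txt_exists("/home/var/www/html"))
```

If this prints False, the file is missing, which is exactly what the log entry is reporting.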
#3
Change /home/var/www/html/ to http://www.YOURDOMAIN.COM, then add /robots.txt,
so the address should look like http://www.yourdomain.com/robots.txt
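For what it's worth, the same address can be built in code. This is a small sketch using Python's urllib.parse.urljoin; robots_url is a hypothetical helper name, and www.yourdomain.com is just the placeholder from above, not a real site.

```python
from urllib.parse import urljoin


def robots_url(site: str) -> str:
    """Return the robots.txt address for the site that `site` points into."""
    # The leading slash makes urljoin drop any path already present in `site`,
    # so the result always points at the root of the host.
    return urljoin(site, "/robots.txt")


if __name__ == "__main__":
    print(robots_url("http://www.yourdomain.com"))
```

Opening that address in a browser shows whether the web server can actually serve the file.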
#4
But why does MSNBot look for this file? I think I must write something in it to get my site ranked.
#5
Are you sure you've created it correctly?
#6
I didn't do anything... I just see this in my error.log.
#7
robots.txt is a text file that most spiders check for. It specifies which pages of your site you do not want them to crawl or index. Do a Google search to find out how to use it.
#8
robots.txt
Don't worry, I also wondered what the hell it was, and my first idea was to add some keywords to the file. It actually has to do with keeping robots out of your site.
On www.robotstxt.org they say the following:

"Why do I find entries for /robots.txt in my log files?
They are probably from robots trying to see if you have specified any rules for them using the Standard for Robot Exclusion, see also below. If you don't care about robots and want to prevent the messages in your error logs, simply create an empty file called robots.txt in the root level of your server. Don't put any HTML or English language "Who the hell are you?" text in it -- it will probably never get read by anyone :-)

How do I prevent robots scanning my site?
The quick way to prevent robots visiting your site is put these two lines into the /robots.txt file on your server:

    User-agent: *
    Disallow: /

but its easy to be more selective than that.

Where do I find out how /robots.txt files work?
You can read the whole standard specification but the basic concept is simple: by writing a structured text file you can indicate to robots that certain parts of your server are off-limits to some or all robots. It is best explained with an example:

    # /robots.txt file for http://webcrawler.com/
    # mail webmaster@webcrawler.com for constructive criticism

    User-agent: webcrawler
    Disallow:

    User-agent: lycra
    Disallow: /

    User-agent: *
    Disallow: /tmp
    Disallow: /logs

The first two lines, starting with '#', specify a comment.
The first paragraph specifies that the robot called 'webcrawler' has nothing disallowed: it may go anywhere.
The second paragraph indicates that the robot called 'lycra' has all relative URLs starting with '/' disallowed. Because all relative URL's on a server start with '/', this means the entire site is closed off.
The third paragraph indicates that all other robots should not visit URLs starting with /tmp or /log.
Note the '*' is a special token, meaning "any other User-agent"; you cannot use wildcard patterns or regular expressions in either User-agent or Disallow lines.

Two common errors:
* Wildcards are _not_ supported: instead of 'Disallow: /tmp/*' just say 'Disallow: /tmp/'."
* You shouldn't put more than one path on a Disallow line (this may change in a future version of the spec)
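To see how those rules play out, the example file from the quote can be fed to Python's standard urllib.robotparser module. This is only a sketch: the helper name load_rules, the paths being tested, and the use of "MSNBot" as the agent string are my own picks for illustration, and a real crawler may interpret a file differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The example file quoted from www.robotstxt.org
EXAMPLE = """\
# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism

User-agent: webcrawler
Disallow:

User-agent: lycra
Disallow: /

User-agent: *
Disallow: /tmp
Disallow: /logs
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(EXAMPLE)

# Illustrative checks: these agent names and paths are assumptions, not real log data.
for agent, path in [
    ("webcrawler", "/tmp/report.html"),
    ("lycra", "/index.html"),
    ("MSNBot", "/tmp/report.html"),
    ("MSNBot", "/index.html"),
]:
    verdict = "allowed" if rules.can_fetch(agent, path) else "blocked"
    print(f"{agent:10} {path:18} {verdict}")
```

With this file, webcrawler is allowed everywhere, lycra is blocked everywhere, and any other robot (MSNBot in this sketch) is only kept out of /tmp and /logs.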
Viewing: Dev Shed Forums > Web Design > Search Engine Optimization > Robots.txt not found