Saturday, 16 July 2011

What is robot.txt File?

Robots.txt is a text (not html) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way from preventing search engines from crawling your site

Structure of a Robots.txt File : -
User-agent:
Disallow:
User-agent” are search engines' crawlers and disallow: lists the files and directories to be excluded from indexing. In addition to “user-agent:” and “disallow:” entries, you can include comment lines – just put the # sign at the beginning of the line:
# All user agents are disallowed to see the /temp directory.
User-agent: *
Disallow: /temp/

No comments:

Post a Comment