ROBOTS.TXT
Robots, crawlers, and spiders are different names for the same thing: programs that visit our website and index its pages in a search engine's database. Google's robot is called Googlebot, Yahoo's robot is called Yahoo! Slurp, and MSN's robot is called MSNBot.
robots.txt is a plain text file that is usually created when the website itself is created. The file name is case sensitive, so keep all the letters in lower case.
We give instructions to the search engines through the robots.txt file. It tells a crawler whether to include or exclude a particular file or directory when it visits the site. We create this file to keep pages that we do not want shown out of the search engine results. Because robots.txt is itself publicly readable and crawlers follow it voluntarily, it should not be relied on to protect truly confidential data.
We place robots.txt in the root directory of the website. That means the file sits beside the index file of our website.
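Because the file always lives in the root directory, its address can be derived from the address of any page on the site. The following is a minimal Python sketch of that idea; the function name robots_url and the www.example.com address are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path with /robots.txt,
    # and drop any query string or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address, used only for illustration.
print(robots_url("https://www.example.com/howto/sample.html?page=2"))
# prints https://www.example.com/robots.txt
```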
How to create a robots.txt file?
Example:
User-agent: *
Disallow: /examples_folder/example.html
User-agent: * means that the rule applies to every crawler. Disallow: instructs the crawler not to crawl the file /examples_folder/example.html
If there are several files that you do not want crawled, give a separate Disallow: line for every file.
Example:
User-agent: *
Disallow: /examples_folder/example.html
Disallow: /examples_folder/example22.html
Disallow: /howto/sample.html
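One way to check rules like these before publishing them is Python's standard urllib.robotparser module. The sketch below is only an illustration: the helper name is_allowed and the paths /index.html and /examples_folder/other.html are assumptions that do not come from the example above.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the example above.
RULES = """\
User-agent: *
Disallow: /examples_folder/example.html
Disallow: /examples_folder/example22.html
Disallow: /howto/sample.html
"""


def is_allowed(path: str, agent: str = "Googlebot") -> bool:
    """Return True if the rules let the given crawler fetch the path."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(RULES.splitlines())
    return parser.can_fetch(agent, path)


print(is_allowed("/examples_folder/example.html"))  # False: listed in a Disallow line
print(is_allowed("/howto/sample.html"))             # False: listed in a Disallow line
print(is_allowed("/index.html"))                    # True: no rule matches (hypothetical path)
```

Only the listed files are blocked. Another file in the same folder, such as the hypothetical /examples_folder/other.html, is still allowed.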
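If you do not want any file in a folder to be crawled by the search engines, give the command as Disallow: /examples_folder/ and none of the files inside this folder will be crawled. The trailing slash matters: a rule is matched against the beginning of the path, so without the slash it would also match other paths that merely start with the same name.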
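A folder rule is matched against the beginning of each path. The sketch below uses Python's standard urllib.robotparser module to illustrate this. It assumes the folder is named /examples_folder/ as in the earlier examples; the helper blocked_paths and the path /examples_folder_2/x.html are made up for the demonstration.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(rules: str, paths: list[str], agent: str = "Googlebot") -> list[str]:
    """Return the paths that the given rules forbid the crawler to fetch."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]


# Hypothetical paths used only for this demonstration.
PATHS = [
    "/examples_folder/example.html",
    "/examples_folder/sub/page.html",
    "/examples_folder_2/x.html",
    "/howto/sample.html",
]

WITH_SLASH = "User-agent: *\nDisallow: /examples_folder/\n"
WITHOUT_SLASH = "User-agent: *\nDisallow: /examples_folder\n"

print(blocked_paths(WITH_SLASH, PATHS))
# ['/examples_folder/example.html', '/examples_folder/sub/page.html']

print(blocked_paths(WITHOUT_SLASH, PATHS))
# ['/examples_folder/example.html', '/examples_folder/sub/page.html', '/examples_folder_2/x.html']
```

With the trailing slash only the contents of the folder are blocked. Without it, the rule also catches any other path that begins with the same characters.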
NOTE: There is another way of giving these instructions. We do that through a <meta> tag in the head section of every page, for example <meta name="robots" content="noindex, nofollow">.
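The text above does not spell out what a crawler does with the tag, so the following is only a sketch. It assumes the commonly used form <meta name="robots" content="..."> and shows how a program could read the instructions from a page with Python's standard html.parser module. The class RobotsMetaParser, the function robots_directives, and the sample page are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # Tag and attribute names arrive in lower case from HTMLParser.
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are separated by commas, e.g. "noindex, nofollow".
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def robots_directives(html: str) -> list[str]:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page used only for illustration.
PAGE = """<html><head>
<title>Sample</title>
<meta name="robots" content="noindex, nofollow">
</head><body>Hello</body></html>"""

print(robots_directives(PAGE))  # ['noindex', 'nofollow']
```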
