A robots.txt file is a file placed on the server that tells search engine bots not to crawl or index certain sections or pages of that website. The website owner can use a robots.txt file to prevent indexing completely, to prevent only some portions of the website from being indexed, or to give individual indexing instructions to a particular search engine.
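As a rough illustration of the three uses described above, the following Python sketch feeds three small rule sets to the standard library's urllib.robotparser module and asks whether a given bot may fetch a URL. The bot names (ExampleBot, OtherBot), the /drafts/ directory and the helper function allowed() are hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

SITE = "http://www.yourwebsitedomain.com"

# 1. Keep every bot out of the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# 2. Keep every bot out of one section only (hypothetical directory).
BLOCK_SECTION = """\
User-agent: *
Disallow: /drafts/
"""

# 3. Give one bot (hypothetical name) its own instructions.
#    An empty Disallow line means "nothing is disallowed".
PER_BOT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, SITE + path)


print(allowed(BLOCK_ALL, "OtherBot", "/index.html"))
print(allowed(BLOCK_SECTION, "OtherBot", "/drafts/page.html"))
print(allowed(BLOCK_SECTION, "OtherBot", "/index.html"))
print(allowed(PER_BOT, "ExampleBot", "/index.html"))
print(allowed(PER_BOT, "OtherBot", "/index.html"))
```

The parser here only models how a well-behaved bot would read the rules; it does not show what any particular search engine actually does.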
The robots.txt file itself is a plain text file that can be created in a simple editor such as Notepad. It must be kept in the root directory of the website, which is the directory where the homepage or index page is stored. All search engines, or at least all the popular ones, now look for a robots.txt file as soon as their spiders or searchbots arrive at the website. Therefore, even if you do not currently need to exclude spiders from any portion of the website, having a robots.txt file is still a good idea, as it can serve as a kind of invitation to the website.
There are many situations in which the webmaster or website owner might wish to exclude spiders from some portion of the site or from the entire site. Some of them are listed below:
When the website, or some of its pages, is still at an early stage and you do not want that raw, unfinished work to be visible in search engines.
You have information that, though not sensitive enough to be worth password-protecting, is of no interest to anyone but the people it is meant for, and you would prefer that it not appear in search engines.
Most people have some directories that they do not want crawled. For instance, nobody wants their cgi-bin to be indexed, or a directory that simply holds a thank-you page or error pages.
If you use doorway pages (similar pages, each one optimized for a separate search engine), you may want to make sure that no individual robot can reach all of them. This is essential to avoid being penalized for spamming a search engine with a series of very similar pages.
Many website owners may want to exclude some searchbots or spiders altogether, for instance those from search engines they do not wish to appear in, or those whose only purpose is gathering email addresses.
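A few of the situations listed above could be expressed roughly as follows. This is a minimal sketch, not a recommended configuration: the /thankyou/ and /under-construction/ directory names and the bot names UnwantedBot and SomeSearchBot are placeholders, and the rules are checked with Python's standard urllib.robotparser module. Bear in mind that following robots.txt is voluntary, so a bot that only wants to collect email addresses may simply ignore it.

```python
from urllib.robotparser import RobotFileParser

SITE = "http://www.yourwebsitedomain.com"

# Hypothetical rules: directory names and the bot name are placeholders.
ROBOTS_TXT = """\
User-agent: UnwantedBot
Disallow: /

User-agent: *
Disallow: /cgi-bin/
Disallow: /thankyou/
Disallow: /under-construction/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def report(agent: str, path: str) -> str:
    """Describe whether `agent` may fetch `path` under ROBOTS_TXT."""
    verdict = "allowed" if parser.can_fetch(agent, SITE + path) else "blocked"
    return f"{agent} {path}: {verdict}"


for agent, path in [
    ("SomeSearchBot", "/index.html"),
    ("SomeSearchBot", "/cgi-bin/form.pl"),
    ("SomeSearchBot", "/under-construction/new.html"),
    ("UnwantedBot", "/index.html"),
]:
    print(report(agent, path))
```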
Have you looked at your website statistics recently? If the statistics include a section on 'files not found', expect to see many entries where search engine spiders requested a robots.txt file on the website and failed to find one.
So let us get started. Create a regular text file called "robots.txt", and be sure it is named exactly that. The file should be uploaded to the root directory of the website and never to a subdirectory (i.e. http://www.yourwebsitedomain.com but NOT http://www.yourwebsitedomain.com/somestuff/).
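To make the location rule concrete, here is a small Python sketch that works out where a spider would look for robots.txt, starting from the address of any page on the site. The function name robots_txt_url() and the page path used in the example are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt address for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path is always /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.yourwebsitedomain.com/somestuff/page.html"))
```

Whatever page the function is given, the result points at the root of the site, which is why a robots.txt file uploaded to a subdirectory is never found.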