Search engine spiders crawl your website periodically. How do you limit a spider's access so the search engine crawls only the pages you want to show and ignores the rest? What you need is a file called “robots.txt”. The robots.txt file is used to disallow the search engine spiders access to folders or pages that you don’t want indexed.
For example, you may have created some personal pages, or your company may have web pages that are for internal use only. You may not want to expose such pages to search engines.
You just need to upload “robots.txt” to the root of your web site. Use a text editor (e.g. NotePad) to create the file. The syntax is simple:
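A minimal example, blocking every spider from the entire site:

    User-agent: *
    Disallow: /

The User-agent line names the spider a rule applies to, and each Disallow line below it names a path that spider must not crawl.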
The * acts as a wildcard, so the rule applies to all spiders.
To target a specific spider instead, visit http://www.robotstxt.org/wc/active/html/ to find the name of the user agent you want.
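For instance, a record aimed only at Google’s spider (Googlebot) might look like this, with /private/ standing in for whatever folder you want hidden:

    User-agent: Googlebot
    Disallow: /private/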
To disallow an entire directory, give its path with a trailing slash.
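Assuming an illustrative folder named /internal/, the rule would be:

    User-agent: *
    Disallow: /internal/

Everything under /internal/ is then skipped by the spiders.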
To disallow an individual file, give its full path.
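For example, with a hypothetical page named secret.html inside that folder:

    User-agent: *
    Disallow: /internal/secret.html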
To block multiple files or folders, write each path on its own Disallow line.
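For instance (again with illustrative paths):

    User-agent: *
    Disallow: /internal/
    Disallow: /drafts/old-page.html
    Disallow: /tmp/

Each Disallow line holds exactly one path; you cannot list several paths on a single line.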