If you would like to prevent Girafa from indexing your Web site, or a specific page of your Web site, you can do so with the Robots Exclusion Standard by setting up a robots.txt file or a robots META tag. It is a simple process that lets you control what compliant Web crawlers index. To remove existing URLs from the Girafa database, you need to:
Robots Exclusion Instructions
If you do not want our crawler to index your Web site, place a file called robots.txt on your server. The file MUST be located in the root directory of the Web site. The contents of the robots.txt file would be:
User-agent: girafa
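Under the Robots Exclusion Standard, a User-agent line is normally followed by one or more Disallow lines that list the paths the crawler should skip. The text above preserves only the User-agent line, so the sketch below assumes `Disallow: /` (exclude the whole site) as the intended rule. It uses Python's standard `urllib.robotparser` module to check how such a file would be interpreted; the helper name `is_allowed` and the example.com URL are illustrative only.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt body. The "Disallow: /" line is an assumption:
# the original text shows only the User-agent line.
ROBOTS_TXT = """\
User-agent: girafa
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/index.html"  # hypothetical URL
    print("girafa allowed:  ", is_allowed(ROBOTS_TXT, "girafa", page))
    print("otherbot allowed:", is_allowed(ROBOTS_TXT, "otherbot", page))
```

With these assumed contents, only the crawler named `girafa` is excluded; crawlers with other names are unaffected because the file has no record that applies to them.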
Alternatively, if you want to exclude the Girafa crawler from specific pages, you can add a robots META tag to the head section of the source code for each such Web page.
For more information on robots exclusion visit: http://www.robotstxt.org/wc/norobots.html
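The exact META tag recommended for the Girafa crawler is not preserved in the text above. As a rough illustration of the general robots META tag mechanism, the sketch below assumes the common form `<meta name="robots" content="noindex, nofollow">` and uses Python's standard `html.parser` module to read the directives from a page. The class name `RobotsMetaParser`, the helper `robots_directives`, and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of robots META directives found in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Assumed example page; the tag value is an assumption, not Girafa's own text.
SAMPLE_PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow">
<title>Example</title>
</head><body>Hello</body></html>"""

if __name__ == "__main__":
    print(sorted(robots_directives(SAMPLE_PAGE)))
```

A check like this can be run against a page before publishing to confirm that the tag sits in the head section and is spelled as intended.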