FAQ - How do I prevent the FusionBot spider from indexing certain pages on my site?

You can prevent our spider from indexing certain pages and frames in one of three ways: by placing a ROBOTS meta tag in the <HEAD> section of your page's HTML code, as discussed in this FAQ; by placing a robots.txt file in your site's root directory; or by populating the Robots Exclusion Form from within your FusionBot account.

If you wish to exclude only a specific portion or section of content within a page, rather than the entire page, please refer to the following FAQ.

A robots meta tag with content="NOINDEX" will result in the page not being indexed. A robots meta tag with content="NOFOLLOW" will result in the page being indexed; however, the crawler will not follow any links from this page to other pages on your site. Finally, a robots meta tag with content="NOINDEX,NOFOLLOW" will result in neither the page nor its links being spidered.

Following are a few examples:

<META name="ROBOTS" content="NOINDEX">
<META name="ROBOTS" content="NOFOLLOW">
<META name="ROBOTS" content="NOINDEX,NOFOLLOW">

Or, for FusionBot only:

<META name="ROBOTS" content="NOINDEXFB">
<META name="ROBOTS" content="NOFOLLOWFB">
<META name="ROBOTS" content="NOINDEXFB,NOFOLLOWFB">

You can also specify a custom value for the content attribute of the ROBOTS meta tag, exclusive to FusionBot, which instructs the crawler to leave a specific page out of your FusionBot-generated sitemap while still including the page in your actual search results. To exclude a page from your sitemap listing, construct your ROBOTS meta tag with the following content attribute value:

<META name="ROBOTS" content="NOSITEMAP">

The NOSITEMAP value can be used in combination with any of the additional ROBOTS Meta tag options described above.
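If you would like to confirm which ROBOTS directives a page actually carries, a short script can read them back for you. The following is a minimal Python sketch, not FusionBot's own code: the class and function names and the sample page are hypothetical, and the summary simply restates the behavior described above from FusionBot's point of view.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <META name="ROBOTS"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names and attribute names before calling us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for value in content.split(","):
            value = value.strip().upper()
            if value:
                self.directives.add(value)


def robots_directives(html_text):
    """Return the set of ROBOTS directives found in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


def summarize(directives):
    """Restate the FAQ's rules as seen by FusionBot (an assumed mapping)."""
    return {
        "indexed": not ({"NOINDEX", "NOINDEXFB"} & directives),
        "links_followed": not ({"NOFOLLOW", "NOFOLLOWFB"} & directives),
        "in_sitemap": "NOSITEMAP" not in directives,
    }


# Hypothetical page: the meta tag sits inside the <HEAD> section.
SAMPLE_PAGE = """
<HTML>
<HEAD>
<TITLE>Internal notes</TITLE>
<META name="ROBOTS" content="NOINDEXFB,NOFOLLOWFB">
</HEAD>
<BODY><A href="/drafts/">Drafts</A></BODY>
</HTML>
"""

found = robots_directives(SAMPLE_PAGE)
print(sorted(found))
print(summarize(found))
```

Running the same check against a saved copy of your own page is a quick way to catch a misspelled directive or a tag that ended up outside the <HEAD> section by mistake.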

The robots.txt file can also be used to prevent a spider from indexing certain pages or directories. This file, which is placed in your site's root directory, contains DISALLOW commands that determine which pages and directories the spider is permitted to index.
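To see how DISALLOW rules are read, you can test a draft robots.txt before publishing it. The sketch below uses Python's standard urllib.robotparser module, which is a general-purpose parser and not FusionBot's crawler. The paths, the example.com address, and the "ExampleBot" name are hypothetical. The rules are listed under "User-agent: *" so they apply to every crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are examples only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/old-page.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_index(path, user_agent="ExampleBot"):
    """Return True if the rules above let the given crawler fetch the path."""
    return parser.can_fetch(user_agent, "https://www.example.com" + path)


for path in ("/index.html", "/private/report.html", "/drafts/old-page.html"):
    print(path, "allowed" if may_index(path) else "disallowed")
```

A rule that ends in a slash, such as /private/, covers everything inside that directory, while a rule naming a single file covers only that file.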

Populating your Robots Exclusion Form from within the FusionBot Member Center is identical in configuration to creating a robots.txt file. However, the Robots Exclusion Form eliminates the need to publish a robots.txt file to your web server. If you don't already have a robots.txt file in place, or cannot easily create and publish one, the Robots Exclusion Form may be your best option.

Note: Implementing a robots.txt file or populating your Robots Exclusion Form will have the same effect as using the combination of NOINDEX and NOFOLLOW in your META tags, without the need to modify each page you wish to apply these instructions to. Since this approach disallows the crawler from accessing a page altogether, not only will the specified page be left out of your index, but any pages that are linked only from the specified page, and not from elsewhere in your site, will also not be found or included in your index.
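The side effect described in this note can be illustrated with a toy crawl. The Python sketch below is a simplified model, not a description of how FusionBot crawls: the site map, the page names, and the crawl function are all hypothetical.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
LINKS = {
    "/index.html": ["/products.html", "/archive/index.html"],
    "/products.html": ["/index.html"],
    "/archive/index.html": ["/archive/2019.html", "/products.html"],
    "/archive/2019.html": [],
}


def crawl(start, disallowed_prefixes):
    """Breadth-first crawl that never fetches a disallowed page."""
    indexed = []
    queue = deque([start])
    seen = {start}
    while queue:
        page = queue.popleft()
        if any(page.startswith(prefix) for prefix in disallowed_prefixes):
            continue  # the page is never fetched, so its links are never seen
        indexed.append(page)
        for target in LINKS.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return indexed


print(crawl("/index.html", []))
print(crawl("/index.html", ["/archive/index.html"]))
```

In the second crawl, /archive/2019.html is not disallowed itself, yet it is missing from the result because the only page that links to it was never fetched. If you want such a page indexed, link to it from a page the crawler is allowed to access.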

Click Here to view our FAQ concerning robots.txt and/or Robots Exclusion Form implementation.

For a detailed explanation of how the robots.txt file can be implemented, please visit https://www.robotstxt.org/robotstxt.html.
