Wednesday, 20 August 2014

Disable indexing of an entire site using HTTP headers

Sometimes a customer wants to prevent an entire website from being indexed, for example while content is being created before the site goes live.

If the site is hosted on the same Sitecore instance as other websites, a robots.txt file is not an option: the sites typically share one web root, so the file would apply to all of them.

So here is a better way, one that works per site instead of being all-or-nothing.

First, open the template that defines your front page item and add a new checkbox field to it. Call the field "Not Indexable".

Next, create a new class in Visual Studio, and add the following code:

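using Sitecore.Data.Fields;
using Sitecore.Data.Items;
using Sitecore.Pipelines.HttpRequest;

public class NotIndexableProcessor : HttpRequestProcessor
{
    public override void Process(HttpRequestArgs args)
    {
        if (args == null || args.Context == null)
        {
            return;
        }

        // Without a resolved site or database there is nothing to check.
        if (Sitecore.Context.Site == null || Sitecore.Context.Database == null)
        {
            return;
        }

        // The start item (front page) of the current site holds the checkbox.
        Item homeItem = Sitecore.Context.Database.GetItem(Sitecore.Context.Site.StartPath);

        if (homeItem == null)
        {
            return;
        }

        // The field is missing when the template has not been extended.
        CheckboxField notIndexableField = homeItem.Fields["Not Indexable"];
        if (notIndexableField == null || !notIndexableField.Checked)
        {
            return;
        }

        // Response.Headers requires the IIS integrated pipeline mode.
        args.Context.Response.Headers["X-Robots-Tag"] = "noindex, nofollow";
    }
}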

Now you just need an include file so that Sitecore runs the processor.
Create a new .config file in the App_Config/Include folder using your favorite XML editor, and add the following text to it:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:x="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <httpRequestBegin>
        <processor type="Namespace.Classname, Assembly" patch:after="processor[@type='Sitecore.Pipelines.HttpRequest.ItemResolver, Sitecore.Kernel']"/>
      </httpRequestBegin>
    </pipelines>
  </sitecore>
</configuration>

Replace Namespace, Classname and Assembly so they match your setup.

Now, if the checkbox is checked on the front page item and any item on that site is requested, the X-Robots-Tag header is added to the response, and search engines that honor the header will not index any page on the site.
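To confirm that the header is actually sent, you can request a page and inspect the response. Below is a minimal sketch of such a check, not part of the Sitecore setup itself. The class name RobotsHeaderCheck and both method names are made up for this example, and it only looks for a plain "noindex" directive (it ignores values prefixed with a user agent, such as "googlebot: noindex").

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper for checking a response; not a Sitecore class.
public static class RobotsHeaderCheck
{
    // Returns true when an X-Robots-Tag value contains the "noindex" directive.
    public static bool HasNoIndex(string headerValue)
    {
        if (string.IsNullOrWhiteSpace(headerValue))
        {
            return false;
        }

        // Directives are comma-separated, e.g. "noindex, nofollow".
        return headerValue
            .Split(',')
            .Select(directive => directive.Trim())
            .Any(directive => directive.Equals("noindex", StringComparison.OrdinalIgnoreCase));
    }

    // Requests the given URL and reports whether any X-Robots-Tag header carries "noindex".
    public static async Task<bool> PageIsNoIndexAsync(HttpClient client, string url)
    {
        using (HttpResponseMessage response = await client.GetAsync(url))
        {
            IEnumerable<string> values;
            return response.Headers.TryGetValues("X-Robots-Tag", out values)
                && values.Any(HasNoIndex);
        }
    }
}
```

Calling RobotsHeaderCheck.PageIsNoIndexAsync(new HttpClient(), url) with the address of any page on the site should return true while the checkbox is checked, and false once it is cleared and the item is published.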
