Control Indexing and Archiving with HTML head meta name="robots" content="noarchive"

HTML development has come a long way since its beginnings, and website design and development have taken many forms. Most developers take great pains to make sure their webpages are indexed by search engines. However, Meta robots tags were introduced in the mid-1990s to allow web authors to ask search engines not to index their web pages. These Meta robots tags have grown in functionality over the years and can now do other things in addition to preventing page indexing.

Using Meta robots tags

In this tutorial, you are going to learn how to use Meta robots tags. We assume you have a working knowledge of HTML basics. If not, please work through a beginner’s HTML course first. Please keep in mind that the indexing-related tags should only be used if you don’t want search engines to index your webpage. This means that if you want your webpage(s) to appear in search engine results, don’t use them. Why do you need robots tags then? You may want to keep a page out of search results (a personal website with pictures, for example). Bear in mind that these tags are requests that well-behaved crawlers honor, not access control, so anyone with the URL can still open the page. Additionally, Meta robots tags can do more than just prevent indexing, which we will see in detail later.

Please note that if you want to keep crawlers away from many webpages rather than a single page, you might want to use the robots.txt file, which is part of the REP (Robots Exclusion Protocol). Meta robots tags are most useful for individual webpages. If you are using both robots.txt and Meta robots tags, the more restrictive of the two effectively applies. For example, if your robots.txt file blocks a search engine from accessing your webpage completely, the crawler never fetches the page, so it will not read any Meta robots tags you may have written in it. If your robots.txt file does not block the search engine but a Meta robots tag asks it not to archive the page, the search engine will first crawl (access) your webpage, read the Meta robots tag and then index the webpage, but it will not archive it.

We’re assuming you’re familiar with the basics of HTML (or XHTML), so we won’t be discussing the code below in depth. If you’re not familiar with HTML, you can pick up the basics of the language with just a little study.

The syntax for a Meta robots tag is:

<meta name="robots" content="value">

In HTML, you would find a robots tag at the beginning of the code, written inside the <head> tag as follows:

<head>
<meta name="robots" content="value">
(…)
</head>
<body> (…) </body>

If you want to place multiple robots tags in your HTML code, you can do it in two ways. The first way is as follows:

<head>
<meta name="robots" content="value1">
<meta name="robots" content="value2">
(…)
</head>
<body> (…) </body>

Just type the robots tag two times inside the <head> tag. Google (and other major search engines) don’t require you to type multiple robots tags: you can instead separate two distinct content values with a comma, like this:

<head>
<meta name="robots" content="value1, value2">
(…)
</head>
<body> (…) </body>

Robots Tag Content Values

You can add content values to give your robots tags a functionality of your choice. Here are some content values that will work with most major search engines like Google, Yahoo, MSN and Ask:

noindex: This is one of the most used robots tags. It prevents your webpage from being indexed (appearing in search engine results). The syntax for this is:

<meta name="robots" content="noindex">

nofollow: This asks search engine bots not to follow the links on your page. This means that if your webpage has multiple links leading to other parts of your website, search engine bots will not follow them from this page (they may still reach those pages through links elsewhere). The syntax for this is:

<meta name="robots" content="nofollow">

none: This prevents search engines from indexing a webpage as well as following the links on it. It is a combination command for noindex and nofollow. Some webpage developers use this command to turn all search engine bots away from a page. The syntax for this is:

<meta name="robots" content="none">
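Since none is described above as shorthand for noindex plus nofollow, the two forms below should be read as asking for the same thing. This is only an illustrative sketch: a page would normally carry one form or the other, not both, and the comments are ours.

```html
<!-- Form 1: the shorthand value -->
<meta name="robots" content="none">

<!-- Form 2: the two values it stands for, separated by a comma -->
<meta name="robots" content="noindex, nofollow">
```

If you are unsure whether a particular crawler understands the shorthand, the explicit second form is the more cautious choice.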

noarchive: This prevents search engines from archiving your webpage (preparing a cached copy which can be accessed by users if your website happens to be down). The syntax for this is:

<meta name="robots" content="noarchive">

nocache: This command is the same as noarchive, but it is used to instruct MSN search to stop archiving your page. The syntax is:

<meta name="robots" content="nocache">

nosnippet: This prevents search engines from showing a description of your webpage in the search engine results (like Google does for most websites). It also prevents search engines from archiving your webpage.  The syntax for this is:

<meta name="robots" content="nosnippet">

noydir: This tag prevents a description of your webpage from appearing in the Yahoo directory (or using a description of your webpage in its search results). This tag is directed only at Yahoo search.  The syntax is:

<meta name="robots" content="noydir">

noodp: This prevents search engines from using the description of your webpage that is found in the Open Directory Project. The Open Directory Project is an open service that maintains a list of internet links, along with descriptions of the webpages these links lead to. The syntax for this is:

<meta name="robots" content="noodp">

You don’t need to use robot tags that ask a search engine to index your page. Search engines will index your page by default, so the following Meta robots tag is redundant (even if it’s valid):

<meta name="robots" content="index">
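To put the content values above in context, here is a minimal sketch of a complete page that combines two of them in a single tag. The page title, the body text and the choice of noindex with noarchive are placeholders picked for illustration, not something prescribed by this tutorial.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- Ask crawlers not to index this page and not to keep a cached copy -->
  <meta name="robots" content="noindex, noarchive">
  <title>Family photos (placeholder title)</title>
</head>
<body>
  <p>Placeholder content for a page that should stay out of search results.</p>
</body>
</html>
```

The robots tag sits inside the head section alongside the other meta information, and the comma-separated form keeps both requests in one place.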

Guidelines for Writing Robots Meta Tags

Meta robots tags are not case sensitive. For example, the noarchive Meta robots tag can be written as follows:

<meta name="ROBOTS" content="NOARCHIVE">

This is the same as:

<meta name="robots" content="noarchive">

You can also write it as:

<meta name="RobotS" content="noArcHive">

However, because the other parts of the code (meta, name and content) all appear in lowercase, we recommend that you stick with lowercase for consistency. Some authors write the values in uppercase so that the tag stands out when they read the code.

Summing up, you can use Meta robots tags to ask search engines not to index or archive your webpage(s), or not to follow the links on them. You may want to use them if you’re coding your own website in HTML and want to keep certain pages out of search results. Though we’ve covered the basics, we recommend you take an advanced HTML course to make sure you’ve gotten it right.