Canonicalization: How to Optimize Your Blog Content & URLs to be Canonical

Every time I hear that word I think of cannons (spelled differently).

[This is part of The Blogger’s Essential Guide to Search Engine Optimization Series.]

Creating canonical content should be of supreme interest to all bloggers, yet more often than not they don’t have a firm idea of what it means or how it affects their blog with every single word that they write.

In fact, once you begin to master this concept your search engine rankings can climb noticeably: you direct the search engines to index the proper content instead of leaving them to guess, and you receive more hits and pageviews as a result. I know this from personal experience (and it’s a yummy thing when you see more traffic to your blog!).

Wait a moment – what the heck is “canonical content” or “canonicalization”? Great question! It’s not too hard of a concept to understand, and it can be best understood in a few fun pictures:

Get the picture? One thing that has helped me (and others) is to choose one of the above pictures and firmly implant it in your memory as a picture of what canonicalization is all about!

You see, canonicalization is about optimizing your content for search engines in such a way that they are able to quickly determine which version is the primary source of your blog content, as opposed to the duplicates that get created.

In other words, which version of the content should search engines prioritize? This is important because search engines are not interested, at all, in displaying duplicate results in their SERPs, as this reduces the effectiveness of their software and service!

A Few Examples:

One of the easiest examples to understand is your own domain, that is, the one and the same (yet different) website address. For example:

[cc]http://john.do[/cc]

and

[cc]http://www.tentblogger.com[/cc]

and

[cc]http://john.do/index.php[/cc]

and

[cc]http://tentblogger.com[/cc]

and

[cc]http://www.tentblogger.com/index.php[/cc]

and

[cc]http://tentblogger.com/index.php[/cc]

You see, these are the same site from the end user’s perspective, but what about from the search engine’s? To a search engine these URLs are actually different! So how do we let search engines know which one is the best or primary?

This might seem funny to you (or a bit odd) and you may have never even thought about it (which is fine). Most web hosts and servers handle this for you, but you can also choose and prioritize one over the other via webmaster tools (like Google Webmaster Tools).

Which one?

Naturally you might wonder which one is better. At this point it really doesn’t matter: choose the one that you like better and stick with it.
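Once you have picked one, the idea is that every other variant collapses into it. Here is a rough sketch of that mapping in Python. It assumes (purely for illustration) that the non-www address is the one you chose and that `index.php` and `index.html` serve the same page as the bare directory; the names `CANONICAL_HOST`, `INDEX_FILES` and `canonical_url` are made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for illustration: the non-www host is the preferred one.
CANONICAL_HOST = "tentblogger.com"
# Assumption: these file names serve the same page as the bare directory.
INDEX_FILES = ("index.php", "index.html")


def canonical_url(url: str) -> str:
    """Collapse www/non-www and index-file variants into one address."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    if host != CANONICAL_HOST:
        return url  # not our site, so leave it untouched

    path = parts.path or "/"
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]  # keep the trailing slash
            break
    # The fragment is dropped; the query string is kept as-is.
    return urlunsplit((parts.scheme, CANONICAL_HOST, path, parts.query, ""))


if __name__ == "__main__":
    for variant in (
        "http://tentblogger.com",
        "http://www.tentblogger.com",
        "http://tentblogger.com/index.php",
        "http://www.tentblogger.com/index.php",
    ):
        print(variant, "->", canonical_url(variant))
```

All four variants end up at the same address, which is roughly what a server-side redirect rule or a preferred-domain setting does for you.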

An Example from WordPress:

Although the main website address example is telling and obvious, you probably care more about how your blog content gets indexed and how canonicalization applies there, and you’d be on to something important!

Here’s the cold hard truth: WordPress, right out of the box, duplicates your content. A lot. Where? Here are just some of the main areas:

  • Index
  • Single Post Layer
  • Category
  • Tags
  • Author
  • Date Archives
  • Pages
  • IDs
  • And more.

Wow, right? What does this look like explicitly? Here’s an example of how WordPress “duplicates” content across multiple different areas:

Above we have the main index page of my blog (the front page) and there you see Cheshire Cat and the blog content below. Google can index this and so we’ll call this Version 1.

Here we have the content shown at the single post layer. We’ll call this Version 2.

And here we have the same content shown at the category level (in “Tips”). We’ll call this Version 3.

So, you can quickly see that, using just these three content areas on my blog, Google could conclude that I have 3 copies of the same content in 3 different places!
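If you want to see this on your own blog, one simple way is to fingerprint the text that each URL serves and group the matches. The sketch below does that in Python with an exact hash. It is only an illustration and not how any search engine actually detects duplicates; the URL paths and the function names are hypothetical, and a real index or category page would carry other posts alongside this one.

```python
import hashlib
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing it and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(pages: dict) -> list:
    """Group URLs that serve identical text; return groups of two or more."""
    groups = defaultdict(list)
    for url, body in pages.items():
        groups[fingerprint(body)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


if __name__ == "__main__":
    # Hypothetical paths standing in for Version 1, 2 and 3 above.
    pages = {
        "/": "Cheshire Cat post body",
        "/cheshire-cat/": "Cheshire Cat post body",
        "/category/tips/": "Cheshire Cat post body",
        "/about/": "About me",
    }
    for group in find_duplicates(pages):
        print("Same content at:", ", ".join(group))
```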

The challenge is that Google (and the other major search engines) does not like duplicate content; their desire is to show (and index) only the primary source (often also referred to as the “master source,” “authoritative source,” or “standard source”) if possible.

Why would WordPress natively create an SEO problem? Well, it’s not really WordPress’ fault (and they have done great work on making sure that the system doesn’t create non-canonical URLs by default!), as we use categories, tags, date-based archives, author archives, and more to provide an easy-to-use, easy-to-navigate blog architecture. In addition, no one can control how other bloggers or websites link to your blog.

So, the challenge is how to use canonicalization to point the search engines at the content that we want them to prioritize in their indexation efforts!

Here are a few ways in which I do this:

  1. Use of the More tag.
  2. Use of categories only; I don’t use tags.
  3. Healthy and proactive use of nofollow in specific content/meta areas.
  4. Strategic use of meta robots and robots.txt.
  5. Use of redirects (301 via server, .htaccess) for duplicate content.
  6. Use of the rel="canonical" tag.

If you’re interested in more information about the 6th item you can take a look at this 20-minute video by Matt Cutts as well as ways to implement it here.

[tentblogger-youtube Cm9onOGTgeM]
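To make the rel="canonical" tag a little more concrete: it is a `<link>` element in the page’s `<head>` that names the preferred URL. Here is a small Python sketch that builds the tag and reads it back out of a page. The helper names (`canonical_link_tag`, `CanonicalFinder`) and the post URL are invented for this example; on a WordPress blog the theme or a plugin would normally output the tag for you.

```python
from html import escape
from html.parser import HTMLParser


def canonical_link_tag(url: str) -> str:
    """Build the <link> element that belongs inside a page's <head>."""
    return '<link rel="canonical" href="{}" />'.format(escape(url, quote=True))


class CanonicalFinder(HTMLParser):
    """Remember the href of the last rel="canonical" link seen in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and (attributes.get("rel") or "").lower() == "canonical":
            self.canonical = attributes.get("href")


if __name__ == "__main__":
    # Hypothetical post URL, used only to show the round trip.
    tag = canonical_link_tag("http://tentblogger.com/cheshire-cat/")
    print(tag)

    finder = CanonicalFinder()
    finder.feed("<html><head>" + tag + "</head><body></body></html>")
    print(finder.canonical)
```

Every duplicate view of a post (category, date archive and so on) would carry the same tag pointing at the single post URL.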

There are many other ways (which I share a bit later) but you get the point for now.
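For item 4 in the list above, here is a quick way to check what a set of robots.txt rules would let a crawler fetch, using Python’s standard `urllib.robotparser`. The rules themselves are an assumption for illustration (they are not the rules from my blog), and the paths are hypothetical archive and post URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep crawlers out of tag and author archives.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tag/
Disallow: /author/
"""


def is_crawlable(path: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if the given rules allow `agent` to fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/cheshire-cat/", "/tag/seo/", "/author/john/"):
        print(path, "crawlable" if is_crawlable(path) else "blocked")
```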

The Bottom Line:

The bottom line is that canonicalization matters for your blog, and if (and when) you do it right you can see significant search engine traffic as a result. I know that by optimizing some of my blog properties even slightly I’ve seen traffic increases of upwards of 20%!

Why? Because I’m helping search engines (and doing some of the work for them) determine what the right results should be, which is what search engines are all about.

As you optimize your blog and define the authoritative source for your blog content you will not only preserve PageRank (from Google’s perspective) but increase it, as it’s no longer passed on to duplicates! There will no longer be internal competition for value, or passing of value, within your own blog!

This is a cursory overview, but it should give you a good idea of what canonicalization means and what basic steps you can take to optimize your blog and make your content always respect it!

[Image via Creative Commons, dune.]

Published by

John

Hacker. Human.