Canonicalization: How to Optimize Your Blog Content & URLs to be Canonical

Every time I hear that word I think of cannons (spelled differently).

[This is part of The Blogger’s Essential Guide to Search Engine Optimization Series.]

Creating canonical content should be of supreme interest to all bloggers, yet more often than not they don’t have a firm idea of what it means or how it affects their blog with every single word that they write.

In fact, once you begin to master this concept your search engine rankings can really surge: you receive more hits and pageviews because you direct the search engines to index the proper content instead of leaving them to guess which copy of your blog post is the real one. I know this from personal experience (and it’s a yummy thing when you see more traffic to your blog!).

Wait a moment – what the heck is “canonical content” or “canonicalization”? Great question! It’s not too hard of a concept to understand, and it can be best understood in a few fun pictures:

Get the picture? One thing that has helped me (and others) is to choose one of the above pictures and firmly implant it in your memory as a picture of what canonicalization is all about!

You see, canonicalization is about optimizing your content for search engines in such a way that they are able to quickly determine what the primary source of your blog content is, instead of the duplicates that are created.

In other words, which version of the content is to be prioritized by search engines? This is important because search engines are not interested, at all, in displaying duplicate results in their SERPs (search engine results pages) as this reduces the effectiveness of their software and service!

A Few Examples:

One of the easiest examples to understand is your own proper domain, that is, the one-and-the-same (yet different) website address – for example:

[cc]http://john.do[/cc]

and

[cc]http://www.tentblogger.com[/cc]

and

[cc]http://john.do/index.php[/cc]

and

[cc]http://tentblogger.com[/cc]

and

[cc]http://www.tentblogger.com/index.php[/cc]

and

[cc]http://tentblogger.com/index.php[/cc]

You see, these are the same site from the end user’s perspective, but what about from the search engine’s? To a search engine these URLs are actually different! So how do we let search engines know which one is the best or primary?
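To make the "same site, different URLs" point concrete, here is a minimal Python sketch. The `canonical_url` helper is hypothetical, and both the choice of the non-www host and the treatment of `/index.php` as the site root are assumptions made for illustration; this is not how any particular search engine works.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the bare (non-www) domain is the preferred host.
PREFERRED_HOST = "tentblogger.com"


def canonical_url(url: str) -> str:
    """Collapse common variants of the site address into one preferred form."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Treat the www and non-www hosts as the same site.
    if host == "www." + PREFERRED_HOST:
        host = PREFERRED_HOST
    path = parts.path
    # Assumption: /index.php serves the same page as the site root.
    if path in ("", "/index.php"):
        path = "/"
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


variants = [
    "http://tentblogger.com",
    "http://www.tentblogger.com",
    "http://tentblogger.com/index.php",
    "http://www.tentblogger.com/index.php",
]

# Compared as plain strings, the four addresses are all different...
print(len(set(variants)))
# ...but they collapse to a single address once normalized.
print({canonical_url(u) for u in variants})
```

A crawler only sees the raw strings, which is why it needs a hint about which form you consider primary.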

This might seem funny to you (or a bit odd), and you may have never even thought about it (which is fine). Most web hosts and servers handle this for you, but you can also prioritize and choose one over the other via webmaster tools (like Google Webmaster Tools):

Which one?

Naturally you might wonder which one is better and at this point it really doesn’t matter – choose the one that you like better.
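Once you have picked a favorite, one common way to enforce the choice is a permanent (301) redirect from the other host. The sketch below only models the decision in Python; the `redirect_for` helper, the preferred host, and the `/about/` path are hypothetical, and on a real blog this logic would normally live in the web server or .htaccess rather than in a script like this.

```python
from urllib.parse import urlsplit

# Assumption: the bare domain was chosen as the preferred host.
PREFERRED_HOST = "tentblogger.com"


def redirect_for(url: str):
    """Return (301, target) if the URL should be redirected, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == PREFERRED_HOST:
        return None  # Already on the preferred host; serve the page as usual.
    if host == "www." + PREFERRED_HOST:
        target = f"{parts.scheme}://{PREFERRED_HOST}{parts.path or '/'}"
        if parts.query:
            target += "?" + parts.query
        return 301, target  # 301 means "moved permanently".
    return None  # Not one of our hosts; out of scope for this sketch.


print(redirect_for("http://www.tentblogger.com/about/"))
print(redirect_for("http://tentblogger.com/about/"))
```

Whichever host you choose, the point is that only one of them ever answers with the page itself.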

An Example from WordPress:

Although the main website address example is telling and obvious, you probably care more about how your blog content gets indexed and how canonicalization applies there – and you’d be on to something important!

Here’s the cold hard truth: WordPress, right out of the box, duplicates your content. A lot. Where? Here are just some of the main areas:

  • Index
  • Single Post Layer
  • Category
  • Tags
  • Author
  • Date Archives
  • Pages
  • IDs
  • And more.

Wow, right? What does this look like explicitly? Here’s an example of how WordPress “duplicates” content across multiple different areas:

Above we have the main index page of my blog (the front page) and there you see Cheshire Cat and the blog content below. Google can index this and so we’ll call this Version 1.

Here we have the content shown at the single post layer. We’ll call this Version 2.

And here we have the same content shown at the category level (in “Tips”). We’ll call this Version 3.

So, you can quickly see that with just these three content areas on my blog, Google could interpret that I have 3 copies of the same content in 3 different places!
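If you want to spot this kind of duplication on your own blog, a rough approach is to fingerprint the body text of each page and group the URLs that share a fingerprint. The Python sketch below is a toy under stated assumptions: the URLs and body text are made-up stand-ins for Versions 1, 2 and 3, and real search engines use far more sophisticated duplicate detection than an exact hash.

```python
import hashlib
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after normalizing case and whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical URLs and body text: index, single post, category, and one
# unrelated page for contrast.
pages = {
    "http://tentblogger.com/": "Cheshire Cat and the blog content below",
    "http://tentblogger.com/cheshire-cat/": "Cheshire Cat and the blog content below",
    "http://tentblogger.com/category/tips/": "Cheshire  Cat and the blog content below",
    "http://tentblogger.com/about/": "About this blog",
}

# Group URLs by the fingerprint of their content.
groups = defaultdict(list)
for url, body in pages.items():
    groups[fingerprint(body)].append(url)

# Any group with more than one URL is the same content in several places.
duplicates = [urls for urls in groups.values() if len(urls) > 1]
print(duplicates)
```

The three "versions" land in one group, which is exactly the ambiguity a search engine has to resolve.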

The challenge is that Google (and the other major search engines) does not like duplicate content; their desire is to show (and index) only the primary source (often also referred to as the “master source,” “authoritative source,” or “standard source”) if possible.

Why would WordPress natively create an SEO problem? Well, it’s not really WordPress’ fault (and they have done great work on making sure that the system doesn’t create non-canonical URLs by default!), as we use categories, tags, date-based archives, author archives, and more to provide an easy-to-use and easy-to-navigate blog architecture. In addition, no one can control how other bloggers or websites link to your blog.

So, the challenge is how we use canonicalization to point the search engines at the content that we want them to prioritize in their indexation efforts!

Here are a few ways in which I do this:

  1. Use the More Tag
  2. I don’t use tags, only categories.
  3. Healthy and proactive use of nofollow in specific content/meta areas.
  4. Strategic use of meta robots and robots.txt.
  5. Use of redirects (301 via server, .htaccess) for duplicate content.
  6. Use of the rel="canonical" tag.
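For item 4, it can help to check what a set of robots.txt rules would actually block before you publish them. Here is a minimal sketch using Python's standard `urllib.robotparser`. The rules themselves, the `/tag/` and `/author/` archive paths, and the example post URL are assumptions for illustration; check your own permalink settings, and remember that robots.txt only asks compliant crawlers not to fetch a URL.

```python
from urllib.robotparser import RobotFileParser

# Assumption: example rules that keep crawlers out of tag and author archives.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tag/
Disallow: /author/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawlable(url: str) -> bool:
    """True if a generic crawler may fetch the URL under the rules above."""
    return parser.can_fetch("*", url)


# The tag archive is blocked, the single post stays open.
print(crawlable("http://tentblogger.com/tag/seo/"))
print(crawlable("http://tentblogger.com/cheshire-cat/"))
```

Running a handful of your real URLs through a check like this is a cheap way to catch a rule that blocks more than you intended.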

If you’re interested in more information about the 6th item you can take a look at this 20-minute video by Matt Cutts as well as ways to implement it here.

[tentblogger-youtube Cm9onOGTgeM]
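To show what the rel="canonical" tag itself looks like, here is a small Python sketch that builds the `<link>` element for a page's `<head>`. The `canonical_link_tag` helper and the post URL are hypothetical; in practice your theme or a plugin would output this markup, and every duplicate view of a post would carry the same tag pointing at the single version you consider primary.

```python
from html import escape


def canonical_link_tag(preferred_url: str) -> str:
    """Build the <link> element that goes inside a page's <head>."""
    # Escape the URL so characters like & and " stay valid inside the attribute.
    href = escape(preferred_url, quote=True)
    return f'<link rel="canonical" href="{href}" />'


# Hypothetical single-post URL chosen as the primary version.
print(canonical_link_tag("http://tentblogger.com/cheshire-cat/"))
```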

There are many other ways (which I share a bit later) but you get the point for now.

The Bottom Line:

The bottom line is that canonicalization matters for your blog, and if (and when) you do it right you can see significant search engine traffic as a result. I know that by optimizing some of my blog properties even slightly I’ve seen traffic increase by upwards of 20%!

Why? Because I’m helping search engines clarify what the right results should be (and doing some of the work for them), which is what search engines are all about.

As you optimize your blog and define the authoritative source for your blog content you will not only preserve PageRank (from Google’s perspective) but increase it, as it’s not passed on to duplicates! There will no longer be internal competition for value, or passing of value, within your own blog!

This is a cursory overview, but it should give you a good idea of what canonicalization means and what basic steps you can take to optimize your blog so that your content always respects it!

[Image via Creative Commons, dune.]

  • http://Benrwoodard.com Ben

    Wow, what a great introduction.

    It would probably be safe to say that the easier you make it for humans to find your content on your blog or site, the harder you make it for search engines to determine the most important parts. And therefore the more you need to concentrate on canonical efforts.

    • http://john.do John Saddington

      i agree ben. there does need to be a good balance… but blog for people not for search engines!

  • Chris Langille

    Do I smell a “Tentblogger Simple Canonical” plugin??

    I think I do! LOL

    • http://www.speakingoflove.net Sally Brown

      I like this idea!

    • http://john.do John Saddington

      yes. yes you do. ;)

  • http://www.speakingoflove.net Sally Brown

    Hi John,

    Great article and would love to implement this. However, even after you have explained it, I have no idea how to do it. So I guess I’ll wait for the plug-in. Unless you have another post to walk me through the steps for a real beginner! LOL Thanks

    • http://john.do John Saddington

      you’re probably already implementing some of it right now…!

      the goal is to help search engines make sense of your content. i’ll have a plugin or two to help soon.

  • http://www.tommartinatl.com/ Tom Martin

    I agree with Sally; for the novice this runs a little deeper than my understanding. For example, I thought that tagging would actually improve SEO, not hinder it. Would you recommend going back and eliminating the tags on my 100+ posts?

    Thanks,

    Tom

    • http://john.do John Saddington

      tom,

      for sure i would. … unless you have a unique strategy around creating that user experience… i would delete them all.

  • Dewitt Robinson

    Great pictures to drive home your point!

    • http://john.do John Saddington

      haha. thanks dewitt. i spend a good deal of time on them.

  • http://www.simpleitaliancooking.com Liz

    I have wondered for a long time what Canonical actually means. Thanks for explaining it with visual aids!

    • http://john.do John Saddington

      sure thing liz! it took me a while to “get it” too.