Optimizing Your Blog’s Indexable HTML Content for SEO

[This is part of The Blogger’s Essential Guide to Search Engine Optimization Series.]

One of the best ways to understand how search engines work is to understand what they can and cannot index – in other words, what they can and cannot see.

We talked about some of the challenges and limitations of search engines in the previous post but now it’s time to focus on what they were built to do: Crawl your awesome content, store the information for future use and retrieval, and then produce the results when someone looks for the content that your specific blog provides.

We’re going to get a bit technical, but nothing that should scare you. In fact, by the time this post (and this series) is complete, you’ll be well equipped with what matters most to you as a blogger.

Sure, you’re not a programmer or a software developer, and you didn’t create the blog application that you so fondly use every single day. But you’re a tried and true practitioner, and you can benefit greatly from a little website construction 101.


Blog Pages for Humans and Search Engines:

Becoming a blog architect isn’t as hard as you might think – in fact, simply knowing the basics of how a search engine crawls your site can help you structure it more efficiently, right?

It’s like knowing the rubric, the map so to speak, of how to build that IKEA bookshelf that you bought without thinking:

Instructions make all the difference, right?

By knowing just a bit you can build that shelf faster and more efficiently with the instructions in-hand, right?

Developing an understanding of how search engines index content will help you structure your content for them, but make sure you’re also seriously considering the structure for people too. In this way you’ll develop a search engine-friendly blog that’ll return you pageviews and hits for as long as your blog lives.

I would like that, wouldn’t you?

Let’s Do Some HTML:

Search engines (all search engines) crawl your pages looking for your content in its HTML form, or text format. As we mentioned previously, images, Flash, applets, Java, iframes, plugins, and more are invisible to search engines and the bots that crawl your content. Sure, the content is still there but it’s not “there” for search engines.

What you need to make sure, then, is that you do all that you can to provide the search engines with content that can be crawled easily in the form of HTML text on each blog page.
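
If you’d like a rough feel for what “HTML text” means in practice, here is a minimal Python sketch. It is only an illustration and not how any real search engine works; the `TextOnlyView` class, the `text_only` function, and the sample page are all made up for this example. It uses Python’s standard-library `html.parser` to keep the readable text and drop everything else.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Collects the plain text found in a page (hypothetical helper)."""

    # Content inside these tags is code or styling, not page text.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def text_only(html):
    """Return the text of an HTML string, joined with single spaces."""
    parser = TextOnlyView()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Made-up sample page: one heading, one paragraph, one image, one script.
sample = """
<html><body>
<h1>My Blog Post</h1>
<p>Plain HTML text is easy to crawl.</p>
<img src="chart.png">
<script>var hidden = "not page text";</script>
</body></html>
"""

print(text_only(sample))
# prints: My Blog Post Plain HTML text is easy to crawl.
```

Notice that the image contributes nothing to the output: it carries no text for the parser to pick up.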

There are some advanced ways that you can make those limited elements more open to being crawled:

  1. Images (.gif, .jpg, .png) can be given “alt attributes” in the HTML, or the text can be kept in the HTML and swapped for the image via CSS styling.
  2. Flash, Java, applets, videos, audio, and more can have their content repeated as plain text on the page. In the case of video and audio, transcripts are vitally important.
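
To make the first point concrete, here is a small Python sketch that lists the images on a page that have no alt text. Treat it as an assumption-laden example: the `MissingAltFinder` class, the `images_missing_alt` function, and the sample markup are invented for illustration, and it relies only on Python’s standard-library `html.parser`.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Records the src of every <img> whose alt text is absent or blank."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # Flag images with no alt attribute or with a blank one.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src") or "(no src)")


def images_missing_alt(html):
    """Return the src values of images that lack alt text."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


# Made-up sample markup, not taken from any real blog.
sample = """
<img src="header.png" alt="Blog header with site logo">
<img src="chart.png">
<img src="spacer.gif" alt="">
"""

print(images_missing_alt(sample))
```

A blank alt attribute can be a deliberate choice for purely decorative images, so read the output as a list of images to review rather than a list of errors.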

So how do you know what is in HTML format and what is not? You can use a few tools to help you accomplish this very basic test:

1. Google Cache


You can easily see what the search engines see if you google your own blog and click the “Cached” button.

Then click the “Text-only Version” on the right:

And then you’ll see what search engines see:

What do you see when you do a test on your blog? Is everything there? Is there anything that doesn’t look quite right?

This very basic test quickly shows you not only what search engines see but also what you can do to optimize your blog architecture so that search engines index your content more effectively.

2. SEO Browser

SEO Browser can also show you what a search engine sees. The “Simple” tool will show you the basics while the “Advanced” tool (after registration) can give you access to a few more results, like so:

More data via Advanced.

It’s much the same as Google Cache, but it does give you another look, so it’s worthy of a bookmark.

One neat thing that I discovered just doing this activity is that one of my “more tags” was wrapped in an H3 tag when it shouldn’t have been:


It should look like this:


I quickly adjusted and corrected this. Wow, what a simple (and perfect) example of how this tool helped me catch an error.
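
If you want to hunt for this kind of slip yourself, one option is to list every heading on a page and see whether anything shows up that shouldn’t be a heading. The Python sketch below does that with the standard-library `html.parser`. The `HeadingOutline` class, the `heading_outline` function, and the sample markup are hypothetical; in particular, the sample is not my actual blog code, just a stand-in for a “read more” link sitting inside an H3.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingOutline(HTMLParser):
    """Collects (tag, text) pairs for each heading on a page."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._current = None  # heading tag currently open, if any
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS and self._current is None:
            self._current = tag
            self._parts = []

    def handle_data(self, data):
        if self._current is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(self._parts).split())
            self.outline.append((tag, text))
            self._current = None


def heading_outline(html):
    """Return the headings of an HTML string in document order."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


# Hypothetical markup: the "Read more" link is wrapped in an h3 by mistake.
sample = """
<h2>Post Title</h2>
<p>Intro paragraph.</p>
<h3><a href="/post/#more">Read more</a></h3>
"""

for tag, text in heading_outline(sample):
    print(tag, text)
```

A “Read more” entry sitting in the outline next to your real headings is a good hint that a link got wrapped in a heading tag by accident.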

But these tools can provide much more information (and strategy) than just correcting a more tag, right? Here was the offending code:


So over the next few blog posts I’ll dive into how we can specifically optimize elements of your blog, blog pages, and blog posts so that they can be better indexed by search engines.

