SEO for Non-HTML Documents and Metadata

An article added by Atila M. on 09/17/2008




There’s no harm in posting documents on your website in non-HTML formats such as Word, Excel, PDF, or PowerPoint. All of these formats are indexed by the major search engines, and sometimes they rank well. However, good old HTML still has the upper hand in search. Non-HTML content can be a turnoff to searchers, for a couple of reasons. Other websites might hesitate to link to non-HTML documents because viewing them may disable the “back” button. Also, many searchers will skip over links to non-HTML documents because they don’t want to wait for a separate program to launch and they may not be in the mood for a long download.

Nevertheless, non-HTML content can be optimized and serve you well, especially for the long tail of search. For example, while your home page might rank well for “model cars,” your product PDF could have a better chance of faring well for the term “die-cast model car assembly instructions.” Today, you’ll learn a little bit about what makes non-HTML content work on search engines. Then you’ll make any needed changes to your own docs.

Metadata for Compelling Titles

Search results for non-HTML documents can be downright ugly, because the folks who wrote them never considered how these documents would be presented in the search engines. A page of PDF search results, for example, can include listings titled trad_frt_p or frt_1_2.

What kind of page titles are those? That just isn’t going to cut it in the split-second decision world of search results. Here are possible places that search engines will look for a page title for your document:

• The document title as specified in metadata, which is extra information you write to describe the document (and is stored in a file’s properties but is not visible in the body of the document)

• The first 60 or so characters of the document’s text

• The file name

• Any text in the document that you happened to format in a larger font

Search engines will generally look for metadata first, so defining document metadata is the easiest way to improve your listings. In Adobe Acrobat and Microsoft Office applications, metadata such as Title, Author, and Keywords is very easy to define by selecting File > Properties or File > Document Properties. If you are using other programs to author your documents, look to their help pages for guidance.

You can also define a description in the document metadata, but the search engines will generally gather a snippet from the document content anyway.
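
If you have many documents to check, a short script can show which ones lack a metadata title. The sketch below is only an illustration, not a description of how any search engine actually works. It assumes the file is a .docx, which stores its Title in the docProps/core.xml part of a zip package. The function names and the fallback order after metadata (opening text, then file name) are our own assumptions, and the larger-font case is left out.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

# Namespace of the Dublin Core <dc:title> element in docProps/core.xml.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def read_docx_title(source):
    """Return the metadata Title of a .docx file, or "" if none is set.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        try:
            core_xml = archive.read("docProps/core.xml")
        except KeyError:
            return ""  # the package has no core properties part
    title = ET.fromstring(core_xml).find(f"{DC_NS}title")
    return (title.text or "").strip() if title is not None else ""


def likely_listing_title(metadata_title, body_text, file_name, max_chars=60):
    """Guess the title a listing might show (assumed fallback order)."""
    if metadata_title.strip():
        return metadata_title.strip()
    # Collapse whitespace, then keep roughly the first 60 characters.
    opening = " ".join(body_text.split())[:max_chars].strip()
    if opening:
        return opening
    # Last resort: the file name without its folder or extension.
    return PurePosixPath(file_name).stem
```

Calling likely_listing_title(read_docx_title(path), text, path) for each document gives a rough preview of its listing title. If the result looks like frt_1_2, it’s time to open File > Properties.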

Content Optimization

Non-HTML documents are basically thrown in the mix with all the other documents and websites in a search engine’s index. So, in addition to inserting metadata as described in the preceding section, you should follow the same SEO guidelines for non-HTML documents as you would for your regular web pages: Include your target keywords in text, link to the document from other pages on your site, make sure URLs in the document are clickable so the search engine robots can follow them, and modify the content for improved snippets if desired.
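
The first item on that checklist, target keywords in the text, is easy to spot-check. Here is a minimal sketch that assumes you have already extracted the document’s plain text by some other means. The function name missing_keywords is hypothetical, and the match is a simple case-insensitive phrase search, not a model of how search engines evaluate content.

```python
def missing_keywords(document_text, target_phrases):
    """Return the target phrases that never appear in the document text.

    Matching ignores case and treats any run of whitespace
    (including line breaks) as a single space.
    """
    normalized = " ".join(document_text.lower().split())
    return [
        phrase
        for phrase in target_phrases
        if " ".join(phrase.lower().split()) not in normalized
    ]
```

An empty list means every target phrase appears at least once. Anything returned is a candidate for working into the document text.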

We know it’s not always realistic for non-HTML content to be edited based on SEO principles. And even if optimized, it’s hard for non-HTML documents to rank well against HTML pages for competitive search terms. You may wish to skip optimizing the document content beyond basic metadata and hope for good results with the long tail of search.

You can get a sense of how search engines see your non-HTML content by viewing the HTML alternate page created by Google.

Next to every search result for a non-HTML document, such as a PowerPoint presentation, Google presents a “View as HTML” link.

Viewing Yahoo!’s and MSN’s cached versions of non-HTML files is a similar experience.

Even if you choose not to spend time optimizing your non-HTML documents, we suggest you review this alternate version. Many of your potential site visitors will look here first before investing their time in a download. Nobody expects these pages to look perfect, but you don’t want them to be an embarrassment to your organization.

When to Remove

You may be surprised to learn that keeping non-HTML documents, even if they rank well, can create disadvantages for your site. Consider the following:

• Files like PDFs and Microsoft Word documents are stand-alone entities, so they’re not likely to be integrated into your site’s navigation. If visitors click on one of these files directly from a search engine, they may never even look at the rest of your site. You should weigh whether making your non-HTML content available to the search engines is worth the potential loss of traffic to the rest of your site.

• Since non-HTML documents will often be downloaded onto searchers’ hard drives, it’s possible that your content could be used in ways you don’t condone. If you’re concerned about this, don’t put them on your site. At the very least, be sure that every document is clearly marked with authorship information, copyright notice, and your web address.

• Non-HTML documents may contain confidential information hidden in the metadata that you don’t wish to make public, including things like tracked changes, comments, and speaker notes. It’s always a good idea from a security standpoint to review metadata for your documents before posting them in public view. Workshare’s free software, TRACE!, available at www.workshare.com/products/trace/, can help you weed out potential problems.
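
For a quick first pass at the hidden-content check in the last point, you can inspect Office Open XML files (.docx, .pptx) yourself, because they are zip packages of XML parts. The sketch below is a rough illustration under stated assumptions, not a replacement for a dedicated tool. The function name find_hidden_content is hypothetical. It looks only for tracked insertions and deletions, comments, and PowerPoint notes pages, and it does not cover older binary formats such as .doc or .ppt.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the parts inside a .docx package.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def _contains(xml_bytes, tags):
    """True if any element in the XML part has one of the given tags."""
    return any(el.tag in tags for el in ET.fromstring(xml_bytes).iter())


def find_hidden_content(source):
    """List kinds of hidden content found in a .docx or .pptx file.

    `source` may be a file path or a binary file-like object.
    """
    findings = []
    with zipfile.ZipFile(source) as archive:
        names = set(archive.namelist())
        # Tracked insertions and deletions are <w:ins> and <w:del> elements.
        if "word/document.xml" in names and _contains(
            archive.read("word/document.xml"), {W + "ins", W + "del"}
        ):
            findings.append("tracked changes")
        # Reviewer comments are stored in a separate part.
        if "word/comments.xml" in names and _contains(
            archive.read("word/comments.xml"), {W + "comment"}
        ):
            findings.append("comments")
        # A notes part exists for each slide that has a notes page.
        if any(name.startswith("ppt/notesSlides/") for name in names):
            findings.append("notes pages (possible speaker notes)")
    return findings
```

An empty list only means none of these particular items were found. Properties such as Author and Keywords still deserve a manual look before you publish.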

With metadata in your pages and content rich with keywords, your non-HTML documents may turn out to be healthy sources of targeted traffic for your site!

Search Engine Rankings For this task, you will perform a manual rankings check on the major search engines for all of your top target keywords. With your before-and-after ranks side by side, it’s easy to see what changes have occurred. If you were starting from zero or you had some easy fixes in your optimization, you may have some exciting improvement in ranks. If you aren’t seeing the improvements you’ve been hoping for, take heart. Read the sidebar “It’s a Marathon, Not a Sprint” for thoughts on SEO timin...