How to make your hot new RIA friendly to search engines

Imagine you’re building a cool new Rich Internet Application with Flex or maybe Silverlight. Now, you want search engines to be able to find that application. How do you do this? What special sauce needs to be added to your application to make it SEO-friendly? Should your SWFs or XAML files include special optimizations? Let’s explore.

Consider the following situation. You’re looking for an online word processor that supports bulleted lists. Seems like a reasonable user need. What is the top search engine result going to be? To be more specific, which of the following pages makes the most sense for the user to find first?

What’s the Best Search Result?

  1. The login screen for Google Docs. There’s a short description of the application, but it doesn’t say anything about bulleted lists on this page.

  2. A list of features available in Google Docs from the official site. This page will tell you that the application supports bulleted lists and provides you ways to get more information about the app.

  3. A review by a third-party that tells you a bit about the features in Google Docs, and compares it to other word processors. Again, this page will tell you that Google Docs supports bulleted lists and it should provide you many links to get more information about the app.

  4. A documentation page describing how to use bulleted lists in Google Docs.

Choice number one is mostly useless as the first result. What does a login screen tell you? It says, “You need to sign up for an account to use this mystery word processor. It may or may not have the feature you need.” Choice two, in my opinion, is the best one, since it provides the answer and is the most likely to lead to all sorts of useful information (including ways to sign up and log in when you’re ready). The third choice would be a good one too, but Google would probably prefer one of their own pages to be number one. The fourth choice, the documentation page, is reasonable too. It gives you the answer you were looking for, but you might have to click a few links to find the app itself.

I should note that Google Docs is written in HTML and JavaScript, yet it is perfectly relevant to this discussion. In the example above, I searched for “online word processor bulleted lists” and I got several reviews of Google Docs on the first page. Google Docs itself was not in the top results. Interesting, don’t you think?

Let’s go a step further. What if your application doesn’t need authentication? If a complex user interface similar to Microsoft Word suddenly appears in your browser window, maybe with a loading screen or some whiz-bang animations, you’ll have a moment of complete confusion. Where do you start? Again, I say that it’s better for a potential user to see the list of features or some sort of sales pitch at this point. Otherwise, they’ll have to learn how to use this unexplained user interface before they even know whether a feature they need is supported. I imagine that’s a poor investment in many users’ opinions. They’ll probably leave pretty quickly if they don’t find bulleted lists within a couple of clicks.

This leads to a closely related argument. What’s so useful about that user interface to a search engine? Will a button for bulleted lists, hidden among other controls for formatting options, provide enough information to outweigh that third-party review about Google Docs? Unlikely. In fact, you might as well consider it useless for SEO. Ryan Stewart just posted about why Silverlight’s plain-text XAML isn’t any more SEO-friendly than Flash’s binary SWF format, for the exact same reason. To a search engine, the “semantic” content is what’s important. Search engines don’t care about the presentation layer. (Is CSS very useful to a crawler? Only for catching people who cheat the system.) In this case, good semantic content for Google to offer to a crawler would be a page that tells their potential users about Google Docs and why they should use the application. People don’t specifically search for a page that has a button that says “bulleted list”; they want more detailed information.

Exposing Semantic Data

The feature list and documentation aren’t the only semantic data that one might want to expose for an RIA. Perhaps Google wants the documents created with Google Docs to be listed in search results. When a user finds a specific document, it could load directly into Google Docs. In another case, someone running an online directory of engineering consultants might want their listings available to search engines. Maybe they’d rather provide a rich user interface with all sorts of filtering and dynamic viewing options to visitors to help improve their experience using the list. The search engine should help them find the data they need, but the application greatly improves the way visitors can interact with it.

I mentioned the example of a directory because Adobe’s Ted Patrick wanted to create a listing of Flex consulting firms that is available in an application built with Flex. The aptly named Flex Directory is one of the more SEO-friendly applications running on Flash Player that I’ve encountered. If you load the app, and view the HTML page source, you’ll discover something very interesting. It’s the raw XML listing of company information, but there is no HTML declared and no SWF embedded in that data at all! To improve the experience for the search engine crawler, Ted found a way to load the data first, and place Flex Directory on top of it. The search engine will see the plain-text data, while the user will discover the user interface. The following line in the XML file is what makes it all happen.

<?xml-stylesheet type="text/xsl" href="http://directory.onflex.org/template002.xsl"?>

The browser’s capability of manipulating XML through XSL is the key. If you take a look at the XSL file, you’ll discover that Ted replaces the XML content with a very basic HTML page that embeds the Flex SWF file with SWFObject. He passes the page URL through FlashVars to the Flex application, and the app loads the raw XML without XSL manipulations, much like the search engine crawler.

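// SWFObject 1.x arguments: SWF URL, element id, width, height, required Flash Player version, background color.
var so = new SWFObject( "http://directory.onflex.org/template002.swf" , "fxtxsl" , "100%" , "100%" , "9" , "#191919");
so.addParam( "scale" , "noscale" );
// Pass this page's URL to the Flex app through FlashVars so it can load the raw XML itself.
so.addVariable("xmlurl", document.location );
so.useExpressInstall( "http://directory.onflex.org/expressinstall.swf" );
// Embed the SWF in the element whose id is "flexcontent".
so.write( 'flexcontent' );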
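The data side of this approach can be sketched too. What follows is only a rough illustration of how a server-side script might emit a listing document that points at an XSL stylesheet. It is not Ted’s actual code. The function names, the element names (directory, company, name, url, location), and the example URLs are all assumptions made up for this sketch, and the real Flex Directory schema may look quite different.

```javascript
// Hypothetical sketch: build an XML listing that asks the browser to apply
// an XSL stylesheet. The schema used here is an assumption, not the real
// Flex Directory format.

// Escape the characters that are special in XML text and attribute values.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// companies: array of { name, url, location } objects (hypothetical fields).
// xslHref: URL of the stylesheet that swaps the XML for the rich UI.
function buildDirectoryXml(companies, xslHref) {
  const lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    // This processing instruction is what triggers the XSL transform.
    '<?xml-stylesheet type="text/xsl" href="' + escapeXml(xslHref) + '"?>',
    "<directory>",
  ];
  for (const company of companies) {
    lines.push("  <company>");
    lines.push("    <name>" + escapeXml(company.name) + "</name>");
    lines.push("    <url>" + escapeXml(company.url) + "</url>");
    lines.push("    <location>" + escapeXml(company.location) + "</location>");
    lines.push("  </company>");
  }
  lines.push("</directory>");
  return lines.join("\n");
}
```

A crawler that ignores the stylesheet instruction is left with the plain company data, while a browser that honors it hands the document to the XSL file and shows the application instead.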

I imagine that it wouldn’t be too difficult to set up a similar method to load data into Silverlight or plain-jane AJAX applications. In all cases, a complex web application that works and looks nothing like a regular HTML document can easily be read and understood by a search engine designed to index those documents. Meanwhile, a user will discover a richer presentation that simple CSS can’t provide.
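For the plain AJAX case, a minimal sketch might look something like the following. It assumes a hypothetical listing format with company, name, and location elements, and every function name here is invented for illustration. The idea is the same as in the Flex app: the script requests the page’s own URL, receives the raw XML without any XSL applied, and builds its richer, filterable view from that data.

```javascript
// Hypothetical sketch of an AJAX front end over the same raw XML.
// Element names (company, name, location) are assumptions.

// Return the trimmed text of the first <tag> child, or "" if it is missing.
function textOf(node, tag) {
  const child = node.getElementsByTagName(tag)[0];
  return child ? child.textContent.trim() : "";
}

// Turn a parsed XML document into plain objects the UI can work with.
function readCompanies(xmlDoc) {
  return Array.from(xmlDoc.getElementsByTagName("company"), (node) => ({
    name: textOf(node, "name"),
    location: textOf(node, "location"),
  }));
}

// Case-insensitive filter on name or location; an empty query keeps everything.
function filterCompanies(companies, query) {
  const needle = query.trim().toLowerCase();
  if (needle === "") {
    return companies.slice();
  }
  return companies.filter(
    (company) =>
      company.name.toLowerCase().includes(needle) ||
      company.location.toLowerCase().includes(needle)
  );
}

// Browser-only: fetch the raw XML (no XSL is applied to a fetched response)
// and parse it with the browser's built-in DOMParser.
async function loadDirectory(xmlUrl) {
  const response = await fetch(xmlUrl);
  if (!response.ok) {
    throw new Error("Could not load listing: HTTP " + response.status);
  }
  const xmlText = await response.text();
  const xmlDoc = new DOMParser().parseFromString(xmlText, "application/xml");
  return readCompanies(xmlDoc);
}
```

In a browser, something like loadDirectory(document.location.href) followed by filterCompanies(companies, "seattle") would give the application its data, and rendering the results is left to whatever UI code the app already has.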

About Josh Tynjala

Josh Tynjala is a frontend developer, open source contributor, bowler hat enthusiast, and karaoke addict. You might be familiar with his project, Feathers UI, an open source user interface library for Starling Framework that is included in the Adobe Gaming SDK.

Discussion

  1. Pingback: MS MossyBlog : RIA and Search Engines.

  2. Pingback: RIA and Search Engines. - Noticias externas

  3. Doug McCune

    Before jumping on the XML/XSLT bandwagon, have you done a Google search for “flex directory”? Ted’s Flex app does in fact appear, but with no information other than the page title and URL. A link to flexdirectory.com is first (I would expect that), the Flex Directory app comes second, and Ted’s blog post about the Flex app comes third. Now, I am a bit surprised that the Flex app itself comes before Ted’s blog post, but I assume that’s because the words flex and directory are both in the URL of the Flex app, and only flex appears in the URL of Ted’s blog. So the searcher gets no information about the Flex Directory app other than that it’s a page called “flex directory” located at http://directory.onflex.org. It might be helpful if Google showed a description of the contents. Since the contents are just XML listings of companies, I would have at least hoped that some of those company names would show up in the description.

    If I do a search for “flex consulting firms” or “flex consulting directory” then the top results are all blog posts about the Flex Directory app. So I think this example shows that doing an exact Google search for terms in a URL and page title will get you that result, but doing a search for the actual content won’t. I question whether putting the XML data as the meat of the page actually helps the search rank at all. It certainly doesn’t help the user experience when looking at search results.

  4. Josh Tynjala

    Doug, you bring up a good point, and I agree that the XML/XSL may not be perfect… yet. Google does focus quite a bit on the structure of HTML, and I just checked for “flex directory” on Yahoo, and Ted’s app doesn’t show up at all in the first ten results. Unfortunate.

    However, I still believe that this method is better than anything else you might try to do to the SWF to make it more friendly to crawling. In time, Google and other search engines will start being a little less HTML-centric. In fact, it’s already happening. PDFs, RSS, and other formats frequently appear in Google results for me. Perhaps with the rise of RIAs, more standardized XML dialects will start being used. It makes sense, and I expect to hear SEO “experts” happily recommend “SaSML for such and such type of content” since Google is optimized for that format.

    Anyway, we’ll see how things go. If Ted wants to make Flex Directory a bit better right now, he might consider turning his XML into RSS, since that seems to be a format currently supported by Google.

  5. Ted Patrick

    Here is where it gets interesting! I have been looking at the sitemap.xml format and thinking that it would be ideal to get data about the Flex Directory defined in the Data there.

    The other element might be to just insert a basic XHTML content into the page and allow the style sheet to overwrite the page with a richer UI.

    Now that the site is running we can tinker to improve the search engine listing. Ideally simply adding an should do the trick as XML.

    I think sitemap may have a few more options though. More to come!

    Ted 🙂

  6. Pingback: Sönke Rohde » Search Engines and RIAs

  7. Pingback: Jason’s Blog » Blog Archive » SEO Optimized Flex

  8. Pingback: Dave Johnson » Blog Archive » Ajax Alive and Kicking

  9. Eric

    re: “The other element might be to just insert a basic XHTML content into the page and allow the style sheet to overwrite the page with a richer UI.”

    This is essentially what space150’s “Faust” technique does for sites like relic.com and theivyhotel.com.

    http://blog.space150.com/2007/1/11/faust-flash-augmenting-standards

    The XHTML forms a base, so the site is browseable without Flash. Then, for users with Flash, that data is pulled in and enhanced.

    For example, on theivyhotel.com, slideshow HTML is interpreted with either a simple CSS view or an animated Flash app, depending on the user’s capabilities. Even browsing it with Javascript turned off is not problematic, with the right CSS skin on top.

    e

  10. Pingback: Using Search Engine Optimization (SEO) In Flash | EVOLVE

  11. Pingback: Philly Site Builder » Blog Archive » Optimize your Flex Application for Search Engines

  12. Pingback: Flex and search engine optimisation « Justin J. Moses : Blog

  13. Pingback: HellFest “Open Air Edition”, part I – An Extreme modular (IoC/ PixIoC) ActionScript RIA at Deja-vue.net

  14. bi dashboards

    We can’t avoid RIAs, as they are the most appealing and intuitive for users, but at the same time we can’t avoid search engines, as they are the ones that bring users to our sites. With this in mind, I am looking to optimize some Flex dashboards on my site. This article gave me a start and provided tips for Flash/Flex SEO.