Update: This article discusses how Google determines the snippet links, not how Google determines which sites get snippet links. I hope that clears things up.

There’s been a bit of discussion about these snippet links over on SEM 2.0, and no one really knows how Google determines them. I spent some time researching the links, and I believe Google is using traffic data to choose them.

First off, a bit of background for those of you who don’t know which Google UI snippets I’m talking about.

For some sites, Google exposes “useful links” from within the site. For example, Google will show extra links from Berkley’s web site for a search on “Berkley”. Matt Cutts has said before that these links are generated algorithmically.

People who know Google well will go “Cool” and move on. Other folks will ask things like “Are sites or their links selected by hand–can my site get in on this? Is money involved?” And the answer is: it’s all algorithmic. The algorithms pick the sites where this could be helpful. Of course money isn’t involved at all.

But how exactly are these snippet links being determined?

First off, let’s analyze an interesting example. A search for “adidas” returns 4 snippet links: Style – Originals – Performance – Change Location.

This is a good case study because adidas’s homepage contains only 5 links in total, and 4 of those 5 are shown as snippet links. From a user standpoint, you’d be surprised that Google did not return the shop link as a snippet link. That’s because the shop link points to a page (shopadidas.com) that is not on the home domain, adidas.com.
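The home-domain rule can be sketched in a few lines of Python. This is only an illustration of the observed behavior, not Google’s actual logic; the function name `is_on_home_domain` and the `/style` path are made up for the example.

```python
from urllib.parse import urlparse


def is_on_home_domain(link_url: str, home_domain: str) -> bool:
    """Return True if the link's host is the home domain or one of its subdomains."""
    host = (urlparse(link_url).hostname or "").lower()
    home = home_domain.lower()
    # Require an exact match or a ".<home>" suffix, so that
    # "shopadidas.com" is not mistaken for a subdomain of "adidas.com".
    return host == home or host.endswith("." + home)


links = [
    "http://www.adidas.com/style",  # hypothetical path on the home domain
    "http://www.shopadidas.com/",   # the shop link, on a different domain
]

# Keep only the links that could qualify as snippet links under this rule.
eligible = [url for url in links if is_on_home_domain(url, "adidas.com")]
print(eligible)
```

Note the suffix check includes the leading dot. A plain `endswith("adidas.com")` would wrongly accept shopadidas.com.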

Now let’s take a look at the Style link returned by Google. It’s actually a JavaScript link around an image on the homepage. Google has indexed the link and the alt text.
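To make the alt text point concrete, here is a minimal sketch of how a label could be recovered for an image link using Python’s standard `html.parser`. The class name `LinkLabelParser`, the markup, and the `openStyle()` call are hypothetical, and this is not a claim about how Google’s indexer works.

```python
from html.parser import HTMLParser


class LinkLabelParser(HTMLParser):
    """Collect (href, label) pairs, using an image's alt text as the label."""

    def __init__(self):
        super().__init__()
        self.links = []          # finished (href, label) pairs
        self._href = None        # href of the <a> currently open, if any
        self._label_parts = []   # text and alt fragments seen inside it

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self._href = attrs.get("href")
            self._label_parts = []
        elif tag == "img" and self._href is not None:
            # An image inside a link: fall back to its alt text as the label.
            alt = attrs.get("alt")
            if alt:
                self._label_parts.append(alt)

    def handle_data(self, data):
        if self._href is not None and data.strip():
            self._label_parts.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, " ".join(self._label_parts)))
            self._href = None


# Hypothetical markup resembling a JavaScript link wrapped around an image.
html = '<a href="javascript:openStyle()"><img src="style.gif" alt="Style"></a>'

parser = LinkLabelParser()
parser.feed(html)
print(parser.links)
```

The sketch only recovers the label. It does not resolve where the JavaScript link actually leads, which is the harder part discussed next.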

Interesting, so how is Google picking up the JavaScript link? They’d have to parse the JavaScript, and that could potentially lead to some security issues. There’s an easier way, though: Google could use toolbar data. If you are using the Google Toolbar’s advanced features, the toolbar sends Google some information when you visit a web page, including the URL of the page. This data could be used to track links that Google’s crawler cannot successfully crawl. In other words, with the Google Toolbar, you are Google’s crawler.

Could Google’s toolbar traffic be determining which links they are showing as Google snippet links?

Since you don’t have access to Google’s toolbar traffic data, you have no way of seeing what Google knows about a site’s traffic. But you can compare Alexa traffic to Google snippet links to see if there is a correlation. The comparison is easier when a domain has many subdomains, because Alexa breaks traffic down by subdomain, so I looked at some of those domains to see if there was a pattern. Take a look at this Excel sheet to see some domains whose snippet links resemble Alexa traffic trends (go.com, cnn.com, zdnet.com, netscape.com, w3c).
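One rough way to quantify such a comparison is to check how many of a site’s snippet links fall within its top subdomains by traffic. The sketch below assumes you have copied the per-subdomain traffic shares by hand; the function name `top_n_overlap`, the hostnames, and the numbers are made-up placeholders, not real Alexa figures.

```python
def top_n_overlap(snippet_hosts, traffic_share):
    """Fraction of snippet links that are among the top-N hosts by traffic,
    where N is the number of snippet links shown."""
    n = len(snippet_hosts)
    if n == 0:
        return 0.0
    # Rank hosts from most to least traffic and keep the top N.
    top = sorted(traffic_share, key=traffic_share.get, reverse=True)[:n]
    return len(set(snippet_hosts) & set(top)) / n


# Placeholder data for illustration only (percent of site traffic per host).
traffic_share = {
    "mail.example.com": 40.0,
    "news.example.com": 25.0,
    "shop.example.com": 20.0,
    "blog.example.com": 10.0,
    "help.example.com": 5.0,
}
snippet_hosts = ["mail.example.com", "news.example.com", "blog.example.com"]

score = top_n_overlap(snippet_hosts, traffic_share)
print(f"{score:.2f}")  # 1.00 would mean a perfect match
```

A score near 1.0 across many sites would support the traffic theory. It would not prove it, since Alexa data is only a stand-in for whatever data Google actually has.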

Best example:



Craigslist’s snippet links are a perfect match to Alexa traffic. There are some discrepancies for sites with multiple “homepages”. Yahoo is one example: people may start off at different Yahoo properties, which skews the Alexa traffic data. It’s hard to tell whether a visitor traveled to another page from the homepage or started from that page. Either way, Alexa counts it as traffic to the page. Most likely, Google is only considering traffic that originates from the homepage.

Google snippet links are most likely determined by traffic patterns. Since Google does not allow access to toolbar traffic data, there is no way to know for sure. Many sites’ snippet links closely resemble Alexa traffic stats.

Google snippet links do not return links outside of the home domain.
Google snippet links do not have to come from a text link; they can come from an image link or even a JavaScript link.
Google snippet link text can be determined from an image’s alt text.
Google snippet links can be subdomains of the home domain.
Google snippet links are not determined by PageRank.
Google snippet links are displayed for the top result for a “brand” search or “domain” search. (For example, “zappos” and “zappos shoes”)

