Defeating The Sandbox
Ian McAnerin has been posting about how he’s been able to get around the sandbox with some careful planning and preparation.
First off, some background on Ian’s little experiment. Back in October, Ian registered http://www.mcanerin.us and created a page on that domain aiming to rank for the term “dotus” (I’m not linking to the pages because I don’t want to influence the experiment).
A month later, the experiment page is ranking #4 for “dotus”.
Ian isn’t giving the details behind his findings but we can take a look at Ian’s experiment page and see what he is doing.
The first thing I noticed when visiting the page is that there is a 301 redirect from the mcanerin.us domain to his mcanerin.com domain. After a quick whois lookup, I can tell that his mcanerin.com domain has been registered since 2001. A site command at Google shows only one page indexed for the mcanerin.us domain: the experiment page. A query for “dotus” brings up “www.mcanerin.us/us/dotus.htm” at rank #4. What’s interesting is that this URL no longer resolves to the experiment page and now returns a 404. So what happened?
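For anyone who wants to repeat the redirect check, here is a minimal Python sketch. It assumes nothing about Ian's setup: `check_redirect` is a hypothetical helper name, and it simply sends a HEAD request without following redirects and reports the status code and `Location` header it gets back.

```python
import http.client
from urllib.parse import urlsplit


def check_redirect(url, timeout=10.0):
    """Return (status, location) for a single request, without following redirects."""
    parts = urlsplit(url)
    # Pick the connection class that matches the URL scheme.
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)

    # Rebuild the request target (path plus optional query string).
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    try:
        conn.request("HEAD", target)
        response = conn.getresponse()
        # http.client never follows redirects itself, so a 301 or 302
        # shows up here as-is, along with its Location header (or None).
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

A status of 301 means "moved permanently" and 302 means "found" (a temporary redirect); a 404 comes back with no `Location` header at all.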
A couple of site/link/cache checks at the other search engines might help. Yahoo is showing about 18 pages indexed for the mcanerin.us domain and MSN is showing about 100. Obviously the mcanerin.us domain (or at least the experiment page) had to have been indexed for it to rank. But with a 301 redirect, Google should have dropped the mcanerin.us domain in favor of the mcanerin.com domain. Loading up anything on the mcanerin.us domain brings you to mcanerin.com/EN/us/. So why did all the search engines index the mcanerin.us domain?
I can think of two reasons:
1. Ian started off using a 302 redirect.
2. There is a delay between when the search engines first pick up the domain and when they resolve the 301 in their indexes.
I think the answer is number one. If you take a look at the pages that were indexed, all of them follow exactly the same URL structure as the mcanerin.com domain, except that the .com is replaced with .us. Although many of the mcanerin.us pages no longer resolve (check MSN for the list), if you replace the .us with .com, all of them are still there. From a developer’s standpoint, it would be very easy to write a 302 that swaps out the .us with .com.
A 302 redirect will tell the search engines to keep the mcanerin.us domain indexed along with the content that comes from the mcanerin.com domain. This is what we are seeing.
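To show how little code such a swap needs, here is a rough Python sketch. This is not Ian's code, and the function name and suffix parameters are made up for illustration. It only rewrites the host part of the URL and returns the status code and `Location` value a server would send.

```python
from urllib.parse import urlsplit, urlunsplit


def temporary_redirect(url, old_suffix=".us", new_suffix=".com"):
    """Return (302, location) with the host suffix swapped, or None if it does not apply.

    Assumes the host carries no explicit port, so it ends with the suffix.
    """
    parts = urlsplit(url)
    host = parts.netloc
    if not host.endswith(old_suffix):
        return None

    # Swap only the end of the host name; path and query stay untouched.
    new_host = host[: -len(old_suffix)] + new_suffix
    location = urlunsplit(
        (parts.scheme, new_host, parts.path, parts.query, parts.fragment)
    )
    return 302, location


print(temporary_redirect("http://www.mcanerin.us/us/dotus.htm"))
# prints (302, 'http://www.mcanerin.com/us/dotus.htm')
```

Changing the returned status from 302 to 301 is the only difference between the temporary and the permanent version of this redirect.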
But why is there a 301 redirect on the site now? Was there ever a 302? I can only guess, but I’d say Ian started off with a 302 redirect then switched to a 301 after his experiment was over. Again I may be totally off on this one.
A link check over at Yahoo shows that Ian’s own sites are the only ones linking to his experiment page. Let’s look at what links to Ian’s sites: seobook, sej, seobythesea, sew, seomoz. These are all relevant sites and all authority sites. Ian links to other authority sites like high rankings, sew, sempo, sma-na, etc., so I’d consider Ian’s site part of the authority hub too.
The experiment page is cleanly coded, with XHTML and CSS for layout. The title of the page is “DotUS Domain Registration”, which emphasizes “dotus” and uses “domain registration” as a good semantically related keyword.
There are a couple of mentions of “dotus” early on and in different header tags (h1/h2). What really stands out is the content and use of related keywords. Sprinkled throughout the content, Ian uses related keywords like “dot us”, “domain registration”, “mcanerin.us”, “dotus ccTLD”. Even the alt attributes for images contain related keywords. No single keyword is overused, because each mention is a related variation rather than an exact repeat.
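If you want to eyeball this kind of keyword spread on your own pages, a small Python sketch like the one below can help. The helper name is hypothetical and the phrase list is whatever you choose to pass in; it simply counts whole-phrase, case-insensitive matches in a block of text. It measures nothing about how a search engine actually weighs those terms.

```python
import re
from collections import Counter


def count_phrases(text, phrases):
    """Count case-insensitive, whole-phrase occurrences of each phrase in text."""
    lowered = text.lower()
    counts = Counter()
    for phrase in phrases:
        # The lookarounds keep "dotus" from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(phrase.lower()) + r"(?!\w)"
        counts[phrase] = len(re.findall(pattern, lowered))
    return counts
```

Feed it the visible text of a page and the related terms you care about. One term with a far higher count than the others is a hint that the copy leans on a single exact keyword.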
So how does all this information relate to the sandbox?
Ian’s experiment page is a good example of how to approach the sandbox. In my opinion, the sandbox has always been about your link patterns, specifically TrustRank. The sandbox is not a filter for new sites but a filter for untrusted sites. Great links from on-topic, trusted authority sites pass “trust” to a site and help it avoid the sandbox. For example, this site (http://www.socialpatterns.com/) never hit the sandbox; within a couple of weeks it had already started to rank. It typically picks up links from established SEO sites at a fairly natural rate: if I write something worthwhile, I’ll gain a couple of links. Overall, my trusted links outnumber my untrusted ones.
Overview of Ian’s experiment page
TrustRank – incoming links from Ian’s main page, which is an on-topic authority.
Page Optimization – headers/titles/semantically related keywords.
Link History – slow growth of links; trusted links outweigh untrusted links.
Redirects – I may be totally off on this one, but I think Ian started off with a 302 in order to get the domain indexed with the same content. No real duplicate penalty because it’s obvious that the domain is a geographical copy.
Clean – clean code and small size. XHTML/CSS allows for less code (compared to tables) and still lets designers craft a good-looking page.
Semantic Code – important terms are placed higher in the document and in a higher heading tag. Less important but still related terms are placed in a lower heading tag.
Content – good amount of related content without overstuffing keywords.
Update: I was totally off about the 302. Ian’s posted some more info about his experiment.