What do Dan Thies, Aaron Wall, Dave Naylor, Dave Davis, and select others (*cough Hamlet Batista*) have in common? They think like search engineers.
Dan Thies wrote an excellent piece at SEO Fast Start on the supplemental index (imitated in simpler language here, with the addition of sources for getting links to pull pages out of the Supplemental Index). Elsewhere on his site he explained why the index came into being: to save computational cycles. (Note: Google has since done away with the supplemental index, but Dan’s writing is illustrative of the mentality I’m trying to point you to.)
Aaron Wall explains the search engines and their perspective in his SEO Book. He puts it succinctly and intelligently: the engines focus on using fewer computing cycles (i.e., saving money on electricity and computer/server resources) and making maximal money (read: selling as much advertising as possible). While there are obviously exceptions, that does capture the mindset fairly accurately.
DaveN‘s been consulting with the SEs (you going all whitehat now, Dave?) and did some videos with Rand that should be pay-per-view (OK, please don’t make them PPV!) because they exposed such terrific analytical thinking. (Pay-per-view – now there’s a model for monetizing YouTube! Seriously!) One example amongst many that had me grinning from ear-to-ear like a kid in a candy shop was his point about normalizing link analysis for linkbait link spikes (which would likely have killed Oatmeal’s chances at ranking for online dating keywords if this were already in place).
Dave Davis (whose Red Fly Marketing just released an excellent Firefox extension for local search marketing) pointed out, in a comment at Hamlet Batista’s site, that you can see how targeted your landing page is in AdWords’ eyes by dropping an AdSense block on it.
Some ideas to get you started thinking like an engineer:
- To make your links more natural, throw in some “click here” anchor texts (or combine them with keywords). Use stop words in the anchors, as these will probably be ignored. Have a look at SEO Book.com, for example: the first link in his content has the anchor text “The SEO Book.” So I can link to Pawsites Online’s pages with the text this dog page here and to SEOmoz’s client’s lending page (not to be confused with a landing page :) ). A quick sketch of why those stop words probably cost you nothing follows below.
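Here’s a minimal sketch of how an indexer might normalize anchor text before scoring it. To be clear: the stop-word list is my own guess for illustration, not anything published by an engine.

```python
# Toy illustration (not any engine's actual code) of why stop words in
# anchor text are probably free: a simple indexer likely drops them
# before the anchor is scored at all.
STOP_WORDS = {"a", "an", "and", "here", "the", "this", "to", "click"}  # assumed list

def normalize_anchor(anchor_text: str) -> list[str]:
    """Lowercase, tokenize, and drop stop words from anchor text."""
    tokens = anchor_text.lower().split()
    return [t for t in tokens if t not in STOP_WORDS]

print(normalize_anchor("this dog page here"))  # ['dog', 'page']
print(normalize_anchor("click here"))          # []
```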
- Consider Wiep’s link value factors article. You’ll see that some of the comments address the ‘text surrounding a link’ factor. If you use “click here” anchor text, you can make the surrounding text work for you: click here to get your insurance quote. And you’d be doing a good job, too! From the Anatomy paper by Brin and Page (emphasis mine):
“First, it has location information for all hits and so it makes extensive use of proximity in search.
“Each document is converted into a set of word occurrences called hits. The hits record the word, position in document, an approximation of font size, and capitalization. Every hitlist includes position, font, and capitalization information.
“Hits occurring close together in a document are weighted higher than hits occurring far apart. The hits from the multiple hit lists are matched up so that nearby hits are matched together. For every matched set of hits, a proximity is computed. The proximity is based on how far apart the hits are in the document (or anchor) but is classified into 10 different value ‘bins’ ranging from a phrase match to ‘not even close’.”
In plain English, this means that Google looks at the text around a word to get an idea of what context it tends to occur in. That’s why, when you type in Sante (looking for French-language health information), you get suggestions about Santa Fe and so on.
(And if you want to think like a grey/blackhat, that means finding typos and ambiguities like these that the search engines have trouble with. You build a powerful site/network on the easier meaning of the keyword/niche, then later use it to boost your other site.) A toy version of the proximity-bin scoring follows below.
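Here’s a rough sketch of the proximity-bin idea from the quote above. The paper only says there are 10 bins ranging from phrase match to ‘not even close’; the actual bin boundaries below are my own invention.

```python
# A simplified take on the Anatomy paper's proximity bins: given the
# positions of two query-word hits in a document, score how close they are.

def proximity_bin(pos_a: int, pos_b: int) -> int:
    """Classify the distance between two hits into one of 10 bins:
    bin 0 ~ phrase match (adjacent), bin 9 ~ 'not even close'."""
    distance = abs(pos_a - pos_b)
    if distance <= 1:
        return 0                      # adjacent words: effectively a phrase match
    return min(9, distance // 8 + 1)  # coarser buckets as words drift apart (assumed cutoffs)

# In "click here to get your insurance quote", 'insurance' (pos 5) and
# 'quote' (pos 6) land in the phrase-match bin; distant words don't.
print(proximity_bin(5, 6))   # 0
print(proximity_bin(5, 90))  # 9
```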
- “Both the URLserver and the crawlers are implemented in Python.” So, if you want to crush the spider under your heel, consider what can and can’t be done in Python (assuming Python is still Google’s main crawling language). The tools you use directly affect the results you can get. A toy crawler follows below.
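To get a feel for what a Python spider looks like, here’s a toy crawler built on the standard library only. It’s nobody’s production code, but it’s enough to start thinking about what a Python-based crawler handles well and poorly: timeouts, redirects, malformed HTML, politeness.

```python
# A toy link fetcher: download one page, pull out the hrefs, resolve
# them to absolute URLs. Real crawlers add queues, robots.txt checks,
# deduplication, and rate limiting on top of this core loop.
import re
import urllib.request
from urllib.parse import urljoin

def fetch_links(url: str, timeout: float = 10.0) -> list[str]:
    """Fetch a page and return the absolute URLs of its <a href> links."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    hrefs = re.findall(r'<a\s[^>]*href=["\']([^"\']+)["\']', html, re.I)
    return [urljoin(url, h) for h in hrefs]

# print(fetch_links("http://example.com/"))
```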
- Similarly: “Most of Google is implemented in C or C++ for efficiency and can run in either Solaris or Linux.” If you know of differences between Linux and Windows for server efficiency, you can take a guess at how the code might look for specific purposes. It might also help you figure out how Google managed to get rid of the supplemental index.
- Rand pointed out in the above WBF with Dave N that CTR data can be important. But this is open to manipulation by automated bots, as Rand commented to Vanessa (and noted by yours truly regarding Claria’s RelevancyRank). Here’s more coverage by the talented SEO known as Slingshot.
- Bounce rates are equally important. So? So consider testing DaveN-style cloaked instant redirects to break the referral chain back to the SE. That is, set up your cloaking to put an intermediary page in the way of someone trying to go back to the engine, and have the intermediary be a SERP result – but with your AdSense on it! (Dave, you’re so devious it’s genuinely entertaining :D!) Naturally, be wary of using G analytics on your site if you’re doing this. A bare-bones sketch of the mechanics follows below.
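Purely to illustrate the mechanics Dave describes (a sketch, not anyone’s actual setup; the referrer patterns and URLs are made up), here’s a handler that looks at the Referer header and slips an instant-redirect intermediary in front of search traffic, so the intermediary sits in the visitor’s history between your page and the engine.

```python
# Sketch of referrer-based switching with an instant-redirect intermediary.
from http.server import BaseHTTPRequestHandler, HTTPServer

SEARCH_ENGINE_REFERERS = ("google.", "yahoo.", "live.")  # assumed patterns

class RefererSwitchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        referer = self.headers.get("Referer", "")
        if any(se in referer for se in SEARCH_ENGINE_REFERERS):
            # Search visitor: hand them an instant client-side redirect.
            # The intermediary page (styled as a SERP, ads and all) is
            # what a "back" click can land on instead of the real SERP.
            body = b'<meta http-equiv="refresh" content="0;url=/landing">'
        else:
            body = b"<h1>Landing page</h1>"  # everyone else sees the real page
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(body)

# HTTPServer(("", 8000), RefererSwitchHandler).serve_forever()
```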
- With regard to conversion data, be wary of using Google Analytics (or any engine’s analytics, for that matter) to track your conversions. As Andrew Goodman wrote in his Winning Results with Google AdWords (highly recommended reading, by the way), giving the engines your conversion data is like sharing your sales volume with the landlord.
If you’re doing well, he can easily charge you more rent (increase bid minimums). If you’re doing poorly, he might not want your performance affecting the rest of the mall (i.e., people’s willingness to click on AdWords) and therefore give you the boot.
Moreover, given the bidding system for keywords, sharing your analytics with the SE means that your competition will be fed your best keywords the next time they use the SE’s keyword research tools. After all, the engines want advertisers to succeed so that they’ll keep advertising. Feeding them successful keywords drives prices up, as happened in my PPC management case study. Which leads us to the fact that…
- Search engines want to maximize their auction revenues. And engineers want this to happen so that they get nice raises every year. Showing keyword data helps them get people bidding more on the short tail. But people are lazy and ignore KWs that aren’t in the tools or that show low volume.
- Expect the SEs to share your KWs with competitors, especially if you and they have similar KWs in other campaigns’ ad groups. Split up your campaigns across different accounts (QS won’t matter so much with no one bidding against you on the long tail) and across campaigns. Also split them up across different domains (using minisites and/or redirects) to defeat the likes of SpyFu and KeyCompete.
- Consider working out arrangements with competitors to split short-tail KW advertising opportunities in time to reduce click costs. It may not work in every market, but with fewer bidders in the auction at any one time, your CPC should drop. It shouldn’t run afoul of pro-competition (anti-trust) legislation because you’re not colluding to fix prices or to stop competing entirely… just to split the weeks, times of day, and short-tail KWs where you advertise.
- The engines could use a supplemental-index-type idea in terms of what they rank. Search engines must necessarily follow users. That’s the whole premise of PageRank: the likelihood that a random surfer will visit any particular page (which is why AdWords should pass PR: ads increase the likelihood that someone will visit a page). Users only look at the first page, sometimes the second. A toy random-surfer computation follows below.
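Since the random-surfer premise is the crux here, here’s a textbook power-iteration sketch of PageRank (a classroom simplification, not Google’s implementation): a page’s score is the probability that a surfer who keeps clicking random links ends up there.

```python
# Power-iteration PageRank over a tiny link graph.
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # (1 - damping): the surfer gets bored and jumps to a random page.
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            if not outlinks:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / len(pages)
            else:             # otherwise: rank flows along each outlink
                for target in outlinks:
                    new_rank[target] += damping * rank[page] / len(outlinks)
        rank = new_rank
    return rank

print(pagerank({"a": ["b"], "b": ["a", "c"], "c": ["a"]}))
```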
- The SEs could save computing cycles by only working to return a ranking of the 10 most relevant pages, leaving the rest for when someone actually clicks to page 2, 3, etc. Kinda like cleaning up your lobby and dining room when guests come over but leaving the bedroom mess as is (unless you’re expecting the guests to end up in bed with you, which is another matter…). A sketch of that lazy-ranking idea follows below.
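Here’s one way (my sketch, under the assumption the engines do something in this spirit) to rank only page one eagerly: pull the top 10 with a heap instead of fully sorting every matching document, and compute deeper pages only when they’re requested.

```python
# Lazy top-k ranking: cheap first page, deferred deeper pages.
import heapq

def first_page(scored_docs: list[tuple[float, str]], k: int = 10):
    """Return the k best (score, url) results without sorting everything:
    O(n log k) instead of the O(n log n) a full sort would cost."""
    return heapq.nlargest(k, scored_docs)

def deeper_page(scored_docs: list[tuple[float, str]], page: int, k: int = 10):
    """Compute results for page 2, 3, ... only when someone clicks through."""
    return heapq.nlargest(page * k, scored_docs)[(page - 1) * k:]
```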
- As far as naming stuff goes, Googlers like to go back to their roots: http://en.wikipedia.org/wiki/Googolplex. And they do like their irony and hypocrisy: though the big engine plays dumb when people type in SEO, there’s no “Did you mean ‘Googol’?” when people type Google into the search bar.
- When asking questions at a conference, don’t try to get the engineers to reveal parts of the algorithm. Ask for specific actions you might take on your site (but phrase the question broadly so the lesson can work for others, too). Then replicate those actions with test sites on made-up KWs (“omfhfdsawtfoohlaled” and its stemmed version: “omfhfdsawtfoohlala”), and see what effects the actions have. You can likely draw useful inferences about the algo as a result. See my buddy (and smart fellow) X for more on running SEO experiments. A simple way to log such experiments follows below.
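One simple way (my own sketch; the file name and fields are arbitrary) to keep those experiments honest: log every action you take against a test keyword together with the rank you observe, so ranking changes can be traced back to specific actions.

```python
# Append-only experiment log: date, test keyword, action taken, observed rank.
import csv
from datetime import date

LOG_FILE = "seo_experiments.csv"  # hypothetical file name

def log_observation(keyword: str, action: str, rank: int | None) -> None:
    """Append one dated observation: what you did, where you rank now."""
    with open(LOG_FILE, "a", newline="") as f:
        csv.writer(f).writerow([date.today().isoformat(), keyword, action,
                                rank if rank is not None else "unranked"])

log_observation("omfhfdsawtfoohlaled", "added 3 'click here' anchors", 14)
```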
Please share your ideas for thinking like a search engineer in the comments, and while you’re at it, subscribe to the YOUmoz RSS feed. (And if you’re a search engineer whose name rhymes with Sam’s Clubs, you’re wanted here and here. Ditto the other clever thinkers mentioned above, regarding the first here.)