As my previous posts and articles have tended to be too technical, going forward I will simply try to provide the pieces that are relevant for the discussion, and point to the sources for learning more.
After carefully reflecting on the article that proves Google is using behavioral data in the search rankings, I did some digging and came up with some slightly different conclusions.
First, let me state that I do think they use all the data they collect (or will collect) from search query logs, Google Analytics, Google AdSense, Google Toolbar, browser extensions, DoubleClick, FeedBurner, etc. to improve both their ranking algorithms and their ad-targeting technology. That is the reason, in my opinion, they offer all of these tools for free. The data they collect is far more valuable than what they could charge for the tools. It is so valuable that Ask is even considering selling this data.
WebGeek correctly quotes Google’s official blog:
(Let me give you a little background. A few days earlier, Visio posted his reaction to something he read in Google’s Official Blog proving that they use behavioral data in rankings:)
“Similarly, with logs, we can improve our search results: if we know that people are clicking on the #1 result we’re doing something right, and if they’re hitting next page or reformulating their query, we’re doing something wrong. The ability of a search company to continue to improve its services is essential, and represents a normal and expected use of such data.”
I am really glad to start seeing efforts such as Visio’s. I hope there will be a lot more to come. We should all encourage other SEOs to perform experiments, do research, and publish their findings for peers to review. These are great examples of what we can learn from the scientific community. That’s why they have all the credibility they have.
Although I haven’t been an active participant in the SEO community, I’ve been reading several blogs for years and I feel great respect for all of the experts. One thing I’ve always wanted to see is more effort to back up claims with facts (research papers, patents, etc.) and experiments. Search engines are black boxes, so giving advice based on opinions is inevitable and necessary. However, one of the most difficult things for people trying to learn SEO is all of the contradictory information they find online about the same topic. One suggestion I would like to propose is the creation of a site where we put together all of the SEO insights, each backed with sources (papers, patents, experiments, etc.). We can use an open source license for the content. The idea is to use it as a reference that we can link to when we need to prove our points.
Now, let me explain my different conclusion about Visio’s findings.
I think Google and other search engines use behavioral data for relevance feedback. What is relevance feedback? From Wikipedia:
Relevance feedback is a feature of some information retrieval systems. The idea behind relevance feedback is to take the results that are initially returned from a given query and to use information about whether or not those results are relevant to perform a new query. We can usefully distinguish between three types of feedback: explicit feedback, implicit feedback, and blind or “pseudo” feedback.
Relevance feedback simply means considering the input of actual searchers to determine whether the ranking formulas are producing the best results. Depending on how this information is collected, the feedback can be explicit, implicit, or blind. These lecture notes provide a nice explanation of the process. Please read them for a more detailed look at the topic. It is very interesting.
Please note that relevance feedback is used to tweak the parameters of the ranking formulas, not as an additional factor in the equations.
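To make that distinction concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the two features (text and links), the sample results, and the tune_weights helper are my own assumptions, not anything Google has published. The click data never enters the scoring function; it is only used to choose the weights.

```python
# Hypothetical data: for each query, candidate results with two made-up
# feature scores. "clicked" marks the result searchers preferred in aggregate.
QUERIES = [
    [
        {"url": "a", "text": 0.9, "links": 0.2, "clicked": False},
        {"url": "b", "text": 0.6, "links": 0.8, "clicked": True},
        {"url": "c", "text": 0.3, "links": 0.4, "clicked": False},
    ],
    [
        {"url": "d", "text": 0.8, "links": 0.1, "clicked": False},
        {"url": "e", "text": 0.5, "links": 0.9, "clicked": True},
    ],
    [
        {"url": "f", "text": 0.9, "links": 0.3, "clicked": True},
        {"url": "g", "text": 0.4, "links": 0.7, "clicked": False},
    ],
]


def score(doc, w_text, w_links):
    """Ranking formula: note that clicks are NOT one of its inputs."""
    return w_text * doc["text"] + w_links * doc["links"]


def mean_reciprocal_rank(queries, w_text, w_links):
    """How high, on average, the preferred result ranks for given weights."""
    total = 0.0
    for docs in queries:
        ranked = sorted(docs, key=lambda d: score(d, w_text, w_links), reverse=True)
        rank = next(i for i, d in enumerate(ranked, start=1) if d["clicked"])
        total += 1.0 / rank
    return total / len(queries)


def tune_weights(queries, steps=10):
    """Grid search over the weights, using feedback only as the yardstick."""
    best_w_text, best_mrr = 0.0, -1.0
    for i in range(steps + 1):
        w_text = i / steps
        mrr = mean_reciprocal_rank(queries, w_text, 1.0 - w_text)
        if mrr > best_mrr:
            best_w_text, best_mrr = w_text, mrr
    return best_w_text, 1.0 - best_w_text, best_mrr


if __name__ == "__main__":
    print("text-only weights:", mean_reciprocal_rank(QUERIES, 1.0, 0.0))
    print("tuned weights:", tune_weights(QUERIES))
```

The tuned weights apply to every query and every site at once, which is why I see this as adjusting the formula rather than adding a new per-site factor.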
Does Google use it?
From the Future Work section of the paper describing Google’s original search engine:
…However, other features are just starting to be explored such as relevance feedback and clustering…
Google’s well-known use of quality raters to improve their search results is a clear confirmation that Google already uses relevance feedback in their systems. This type of relevance feedback is explicit feedback. Trusted searchers are presented with different sets of search results for the same query, and they select those that, based on their judgment, include the most relevant results. It usually takes several iterations to get the results right.
Now, to the really interesting part. Implicit feedback attempts to infer user search intent by observing user behavior. This is carefully documented in the lecture notes. This is an excellent use for the intelligence data from Google Analytics (and other properties).
Think about this. They have so much information on us and on our sites that they can get pretty close to what we are thinking.
Bounce rates, repeat visits, retention, etc. are the best indication of whether a search result was good or not.
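As a rough illustration of how such signals could be aggregated, here is a small Python sketch based on the rule of thumb in the Google quote above (a click on the #1 result is good; hitting next page or reformulating the query is bad). The log format, the field names, and the 0.5 threshold are all my own assumptions; nobody outside Google knows what their logs or thresholds look like.

```python
from collections import defaultdict

# Hypothetical session log; every field name here is invented for the example.
SESSIONS = [
    {"query": "blue widgets", "clicked_rank": 1, "next_page": False, "reformulated": False},
    {"query": "blue widgets", "clicked_rank": 1, "next_page": False, "reformulated": False},
    {"query": "blue widgets", "clicked_rank": 3, "next_page": False, "reformulated": False},
    {"query": "cheap flights", "clicked_rank": None, "next_page": True, "reformulated": False},
    {"query": "cheap flights", "clicked_rank": None, "next_page": False, "reformulated": True},
    {"query": "cheap flights", "clicked_rank": 1, "next_page": False, "reformulated": False},
]


def is_satisfied(session):
    """Crude proxy: clicked the #1 result, no next page, no reformulation."""
    return (
        session["clicked_rank"] == 1
        and not session["next_page"]
        and not session["reformulated"]
    )


def satisfaction_by_query(sessions):
    """Share of sessions per query that look satisfied under the proxy."""
    counts = defaultdict(lambda: [0, 0])  # query -> [satisfied, total]
    for s in sessions:
        counts[s["query"]][1] += 1
        if is_satisfied(s):
            counts[s["query"]][0] += 1
    return {q: sat / total for q, (sat, total) in counts.items()}


def weak_queries(rates, threshold=0.5):
    """Queries whose satisfied share falls below an assumed threshold."""
    return sorted(q for q, rate in rates.items() if rate < threshold)


if __name__ == "__main__":
    rates = satisfaction_by_query(SESSIONS)
    print(rates)
    print(weak_queries(rates))
```

The output is a list of queries that seem to be served poorly, not a score for any individual site.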
Now, what is the difference between my conclusions and Visio’s?
1. I don’t think the implicit feedback information is being used as a factor in the ranking formula. I think they use aggregate information to tweak the equation variables. When too many queries are not giving the best results, then they may alter the ranking formula.
2. I don’t think clicking on results will have a direct effect on the ranking formula for the non-personalized Google ranker. This would be very risky for them to do, because it would leave the door open for manipulation.
3. I don’t think a few websites’ behavior information will have a drastic impact on the rankings. Changes to the ranking formula affect many, many sites.
Mr. Singhal often doesn’t rush to fix everything he hears about, because each change can affect the rankings of many sites. “You can’t just react on the first complaint,” he says. “You let things simmer.”
I’ve been blogging for close to three weeks and I have to admit that I am really enjoying it. I did not expect this to be so addictive. I want to thank Rand and the SEOmoz team for giving me the great opportunity of sharing my thoughts via the comments and the Youmoz posts, the SEO community for referencing my posts, and for the excellent feedback I’ve been receiving. You guys rock!