In a move reminiscent of the eighties Pepsi Taste Tests, Bing (UK) has released a survey claiming users have picked their search engine above Google’s in a blind side-by-side comparison of equivalent SERPs. I got this email in my inbox recently:
We’re told not only that Bing’s search results are more popular than Google’s, but also that we can test this ourselves at www.bingiton.com.
11 Reasons the BingItOn Survey makes no sense
Now, like most people in the search industry who don’t work for a search engine, I think strong competition in the market is not just healthy, but essential. I also suspect most people who know the search industry will feel that something is not quite right about Bing’s claim, so let’s take a closer look and see what we can turn up. I am quite prepared to be ‘pleasantly surprised.’
1. Bing’s comparison leaves out the areas of the SERP where Google outperforms them most.
The first thing that sets alarm bells ringing for me is this small print at the bottom of the email:
So the comparison search results they used were:
“Based on an unbranded comparison of the web search results pane.”
Well, that sounds fair enough; a big brand logo would clearly bias people’s opinion. However, the test also:
“excludes all ads.”
This may be fair: if we’re comparing just organic search, it would clearly be odd to include them. Still, it would be hard to argue that the ads presented on a SERP aren’t part of the overall search experience, inasmuch as they might make one set better or worse than another. What’s more, the survey also excluded Google’s Knowledge Graph, which does seem a little like asking people to choose which tin of baked beans they prefer after all the sauce has been removed.
Compare these two sets of search results – I practically dare you to tell me you don’t genuinely prefer the Google set.
(For any younger folks reading this, a CD is an obsolete form of storage for musical data that was still popular when Razorlight was.)
Well, it’s understandable, though, that we might want to compare similarly structured results, and indeed Bing does say:
“To keep it an apples-to-apples comparison of just the algorithmic web search results, any ads or other features were also removed.”
We may come back to this later…
2. The survey claims are based on a very limited subset of searches.
The small print also states that the survey results are based on the most popular searches and links to a site ( http://www.bing.com/blogs/site_blogs/b/uk/archive/2013/10/10/biouk.aspx ) where we can find out a bit more.
Bing says (all my Bing quotes are from the above link or the Bing It On site itself):
“Participants were each asked to do 10 searches, drawn from a list of 450 from the Google UK Zeitgeist 2012 list and a Google trends list from June 2013. This way, we could test searches that most people were likely to have heard of (because they are popular).”
Bing is telling us that these searches are popular, and therefore typical of the kinds of searches people do most often.
Now superficially that might sound fair enough, but of course we know that the great volume of search is in the long tail — the obscure things that interest us, the combinations of popular terms with locations and modifying adjectives, and those weird and wonderful searches done each and every day for the first and only time.
So a popular search is not necessarily representative of what people are actually searching for, and again, the long tail is one of the areas where, personally, I think Bing struggles most.
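To put rough numbers on the long-tail argument, here is a minimal sketch. It assumes search volume falls off in inverse proportion to popularity rank (a Zipf-style curve) and that there are a million distinct queries. Both are illustrative assumptions rather than real search data, and `head_share` is a hypothetical helper written for this example. The only figure taken from the survey is the size of Bing’s list, 450.

```python
def head_share(num_queries: int, head_size: int, exponent: float = 1.0) -> float:
    """Fraction of total search volume captured by the `head_size` most
    popular queries, ASSUMING the query at popularity rank r has volume
    proportional to 1 / r**exponent (a Zipf-style curve, not measured data)."""
    if not 0 <= head_size <= num_queries:
        raise ValueError("head_size must be between 0 and num_queries")
    # Relative volume of each query, from most popular (rank 1) downwards.
    weights = [1.0 / rank ** exponent for rank in range(1, num_queries + 1)]
    return sum(weights[:head_size]) / sum(weights)


# Hypothetical universe of 1,000,000 distinct queries; 450 is the size of
# the list Bing drew its test searches from.
share = head_share(num_queries=1_000_000, head_size=450)
print(f"Top 450 queries: {share:.0%} of volume; long tail: {1 - share:.0%}")
```

Under these made-up numbers, the 450 most popular queries account for a little under half of all searches, so the tail outweighs the head. The exact split depends entirely on the assumed curve and query count, and Bing’s 450 were trending searches rather than the true top 450, which would likely cover less still.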
3. The survey claims are based on a very weird subset of searches.
Now these 450 ‘popular’ searches were taken from where, again?
“the Google UK Zeitgeist 2012 list and a Google trends list from June 2013”
So lists which are popular, yes, but not necessarily the most popular. In fact, these two lists give greatest prominence to trending topics and spikes in traffic for previously lower-volume search terms.
This raises more issues for Bing’s survey. First, trending topics and short-term spikes tend to be news, sport, social media and viral content. These are exactly the queries where the ‘ten blue links’ type of result is weakest, and where searchers are likely to be looking for a Knowledge Graph or onebox-type result that gives an instant answer to their informational need.
And the second issue here is that these searches, which were very relevant at some point in the past, don’t have the same meaning or urgency that they did when they spiked. To take the top sports result from Zeitgeist 2012 as an example, someone searching for ‘synchronized swimming’ today is likely looking for a different type of result to someone making that same search in summer 2012, while the Olympics were in full swing.
This raises another question. Are people rating these results based on what they thought would have been reasonable results back when ‘Hurricane Sandy’ or ‘Olympic torch route’ were newsworthy searches, or more fact-filled reference articles that might be a better answer for those queries today? We aren’t told this.
Let’s Bing It On, then
So it’s probably time to actually have a go at Bing’s Challenge.
The Bing It On homepage offers the bold statement that:
But as we have seen
4. The searches compared aren’t the most popular searches.
They were just the most trendingest searches some months back. And yes, that’s totally a word.
Time for a search, then. First up, a vanity search. I’ll try my Twitter handle:
5. The results sets don’t actually look the same at all.
Straight away, we can see that there are differences in layout between the two sides. The typefaces used are different and the way they display the blue link for the same site varies (i.e. one of them isn’t necessarily using the
Given this, how can we be sure that participants in Bing’s survey didn’t think they were being asked to pick the best looking results? We’re told that the survey takers were asked to choose
“which set of results was the best.”
Perhaps they were given further instructions to choose the most relevant, but if they were, we certainly aren’t. Bingiton.com just asks us to ‘choose the results’ on the left or right.
6. You can easily tell which search engine is which.
Now of course, if you’re reading this you use search engines a lot and you probably spotted the Bing and Google SERP typographic styles immediately. But would the man or woman on the street pick up on this?
Well, people in general use search a lot, and we can reasonably suppose that the majority of survey participants do too, as Bing tells us that
“to make sure they were all reasonably familiar with how to use search, they all were required to have used a major search engine in the past month.”
Now regardless of whether they consciously picked one over the other, and regardless of whether they actually did prefer the Bing results, we were told
“users were ‘blind’ to what search engine the results came from,”
but this is clearly not true, and the lack of a true ‘blind test’ rather undermines any conclusions drawn.
Anyway, I choose the results on the right, on the grounds that the profiles all belong to me. Although Bing’s results (on the left) have some nice Twitter follower data, they also have three links to my good friend and colleague, Nick Curtis. That’s a bit odd.
Now for a term I have something of a professional interest in.
Car insurance searches are some of the most competitive in search, and even niche areas attract a lot of attention from very professional SEO types. But we have some more problems here for Bing’s claims.
7. They haven’t removed all of the adverts.
Unbelievably, sat atop Google’s results is their huge self-promoting ad for their own comparison service. We know it’s an ad because it says ‘sponsored,’ and apparently Bing can’t spot that. Given that this type of ad appears for other financial searches like ‘credit cards,’ ‘loans’ and so on, people doing this challenge are potentially still seeing a lot of ads. Perhaps folk who prefer ad-free results have picked Bing for that reason.
I haven’t, though, because sat proudly at the summit of Bing’s SERP is a low-quality (IMNSHO) exact-match(ish) domain which is of such superlative technical accomplishment that it won’t even load for me at the time I’m writing. Google wins Round Two.
Now a location search. To whom would I go for marketing advice in the lovely town of King’s Lynn?
Well, on the plus side, they’ve both agreed on the correct answer… (ahem), but,
8. They haven’t removed all the ‘other features.’
Remember Bing said that
“To keep it an apples-to-apples comparison of just the algorithmic web search results, any ads or other features were also removed.”
But, right in the middle of this result, we have a Bing Maps box (which has an, err, interesting selection of businesses). Again, this seems like something that might give Bing a bit of an advantage with members of the public.
Not here, though: I have to choose Google’s left-hand set on the grounds that there are no taxi or mobility scooter sites in their results.
Search four. Bing has been asking me to choose searches that mean something to me, so let’s have a look for my son’s blog about buses.
If Bing takes this survey to mean they are the best at organic search, they are distracting themselves from the fact that their algorithm has some very real issues.
9. Some of Bing’s results are just completely irrelevant.
Now, I know for a fact that Dominic hasn’t done any SEO on his blog, but I’m also fairly sure there aren’t many people called Dominic who are writing a blog all about buses. It’s a pretty obscure search, granted, but one that has a definite ‘right’ answer. That makes the results on the right fairly mystifying, especially as Google has managed to pick up his Twitter, Facebook and Tumblr. Note that again we aren’t comparing apples with apples. This time, Google’s related searches section has pulled in additional relevant results. And isn’t Bing supposed to have a closer relationship with Twitter and Facebook?
Left wins.
One search left. “Make it count,” says Bing. Well, I’d like to take the family on a holiday soon. In an ideal world I’d rather not take them to a caravan park, but in that ideal world I have a lot more spare money than I actually do in reality:
10. Some of Bing’s results are just too relevant…
I like family caravan holidays as much as the next man (i.e. not a lot) but Bing absolutely loves them! Seriously, look at all of those keyword-rich domains and all that bolded text. Google has not been completely immune from the invasion of an EMD, but their results are dominated by well-known brands in this busy, competitive niche. But I wonder: if we were to ask a random member of the public which was the better set of results, wouldn’t they (without clicking) pick the set that looked as if it must be the most relevant? That would be the set with all the bolded URLs and matching keywords…
Unsurprisingly, perhaps, my results did not go Bing’s way.
11. Bing is distracting themselves from the real issues with this survey.
Don’t get me wrong, I am desperate for Bing to offer a compelling alternative to Google and grow their market share substantially, so that we, as members of the search community, don’t have all of our eggs in one basket.
It’s also true to say that Bing has made some genuinely great innovations in organic search, and I want them to continue innovating. But if they think the engineering job is done on the back of this survey, they are kidding themselves and trying to kid the public. They could try running the survey again with more thoroughly debranded, ad-free, ‘extra’-free results, but the proof of the pudding is in the eating, and even if their organic search results were objectively the best, searchers will always judge them on the entirety of the search experience.
The small print on that screen above says:
“Based on an unbranded comparison of the web search results pane only; excludes all ads and Google’s Knowledge Graph. The UK’s most popular web searches are taken from the top algorithmic search results according to Google between December 2012 and June 2013.”
But from what I’ve seen here, pretty much every part of that statement falls short of the full story.
Microsoft’s task is a difficult one, and I certainly don’t have the answers they need, but their energies should be focused on making Bing better, rather than making Bing out to be better than Google.
What do you think? These were genuinely the five searches I tried (although I confess that I swapped the order of the first two for narrative purposes). Are Bing’s results better for you, or were the examples I picked representative?
Editor’s note: after this post was submitted, Freakonomics published a post that challenged Bing It On, and conducted their own tests.