Speculation on How Search Engines Apply Temporal Data

In my opinion, temporal data is one of the most valuable kinds of information that feeds into a modern search engine’s algorithm. With it, search engines know when terms are getting more popular, when searchers start using new terms, and when new forms of an old term are being used to refer to new concepts (and thus need to pull up new search results).

For example, a recent Saturday Night Live sketch featured Chris Parnell and Andy Samberg rapping about the Chronicles of Narnia. The video was an instant hit online, with thousands of media outlets, big and small, covering the phenomenon and the new terminology folks picked up from it: “the chronic”-“what?”-“cles of Narnia.”


(From the Video) Andy & Chris Use Yahoo! Maps to Find a “Chronic” Theatre

Searches performed for “chronic narnia” and “chronic what narnia” several weeks ago (before the video became mainstream) showed poor results, with little recognition that the phenomenon existed (Google and Yahoo! both had “did you mean?” suggestions). Today, however, those same searches reveal popular blogs discussing the video and sites hosting copies (many without permission) in the SERPs. Not only have the search engines been quick to display relevant and accurate results, they’ve also recognized the difference between users seeking the mainstream film (and using the full terminology) and those who want the SNL video (and use the appropriate nomenclature). There’s even crossover in the search results, with the YouTube post appearing in the top 10 for both searches.

This kind of rise in search popularity, along with appearances in newly spidered pages and on established authority sites, was recently covered in a detailed paper by Dr. Garcia, “Temporal Co-Occurrence: How does a Developing Event Affects Search Results?”

Dr. Garcia’s tracking of Hurricane Rita results is thorough and shows a clear pattern (although Dr. Garcia himself seems hesitant to proclaim it as fact):

Note that websphere curves peak within the noisy 5-to-15 day interval while blogosphere curves level as in Figure 1 and 2. A quick comparison between Figures 2, 3 and 4 reveals that neither the n12 (FINDALL) nor the co-occurrence curves (c-index and Salton Index curves) were able to discriminate the noise and trends. The EF-Ratio outperformed all these curves. It is evident from the EF-Ratio curve that after day 5 the fraction of documents that did not target the query sequence decreased; i.e., an increasing fraction of the n12 answer set deteriorated after this day.

Quick translation: Google’s blog search engine found a very high number of new pages dealing with the topic between 5 and 15 days after the phenomenon started to get press, followed by a slow decline (see the curve on the report page).
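To make that rise-and-decline pattern a little more concrete, here is a minimal Python sketch of how a monitoring system might flag the days when new pages about a term spike well above their pre-event level. To be clear, this is only an illustration: the function name `burst_days`, the sample counts and the 3x threshold are all assumptions of mine, and nothing here reflects how Dr. Garcia computed his curves or how any search engine actually works.

```python
from statistics import mean

def burst_days(daily_counts, baseline_days=3, threshold=3.0):
    """Return the indices of days whose count is at least `threshold`
    times the mean of the first `baseline_days` days (a hypothetical
    pre-event baseline)."""
    baseline = mean(daily_counts[:baseline_days])
    # Guard against a zero baseline, which is likely for brand-new terms.
    baseline = max(baseline, 1.0)
    return [day for day, count in enumerate(daily_counts)
            if count >= threshold * baseline]

# Made-up counts of newly indexed pages per day for one query.
counts = [2, 3, 4, 10, 25, 60, 90, 85, 70, 40, 20, 8]

print(burst_days(counts))         # → [3, 4, 5, 6, 7, 8, 9, 10]
print(counts.index(max(counts)))  # peak day → 6
```

With these invented numbers the burst starts on day 3, peaks on day 6 and fades below the threshold by day 11. A real system would presumably use a rolling baseline and far noisier data, but the basic idea of comparing recent volume against an earlier norm is the same.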

There are myriad applications for temporal monitoring systems, and it’s even possible that a high number of new searches might help to keep a site from getting “boxed” or help to mark it as an “Internet phenomenon.” With so much data feeding into a search engine and so many results becoming time-sensitive, we can be relatively certain that temporal data will remain an important signal for years to come.
