Despite the possible duplicate content penalty, I’m going to reproduce (yes, word for word) a post I made earlier today on SEW to try to clear up some confusion about the relationship and meaning of LSI/LSA vs. Theming.
LSA – Latent Semantic Analysis
The idea behind this is that by taking a huge composite (index) of millions of web pages, the search engines can “learn” which words are related and which noun concepts relate to one another.
For example, using LSA, a search engine would recognize that trips to the zoo often include viewing wildlife and animals, possibly as part of a tour.
Now, conduct a search at Google for ~zoo ~trips. Note that the bolded words match the terms I italicized in the paragraph above. Google is bolding ‘related’ terms and recognizing which terms frequently occur together (on the same page or in close proximity) in its index.
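To make the "terms that occur together" idea concrete, here is a minimal sketch in Python. It is plain co-occurrence counting over a toy set of pages, not full LSA (which works on a term-document matrix) and certainly not what Google actually runs. The pages, the stopword list, and the function names are all made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy "index": each string stands in for one web page.
PAGES = [
    "zoo trips include wildlife and animals on a tour",
    "our zoo tour shows animals and wildlife",
    "family trips to the zoo to see animals",
    "stock market report and trading news",
]

# Hypothetical stopword list; words too common to say anything about a topic.
STOPWORDS = {"a", "and", "on", "our", "the", "to", "see", "include", "shows"}


def cooccurrence_counts(pages):
    """Count how many pages each unordered pair of terms shares."""
    counts = Counter()
    for page in pages:
        # Deduplicate and sort so each pair is stored once, alphabetically.
        terms = sorted(set(page.split()) - STOPWORDS)
        counts.update(combinations(terms, 2))
    return counts


def related_terms(term, counts, top_n=3):
    """Return the terms that most often share a page with `term`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == term:
            scores[b] += n
        elif b == term:
            scores[a] += n
    return [t for t, _ in scores.most_common(top_n)]


COUNTS = cooccurrence_counts(PAGES)
print(related_terms("zoo", COUNTS))
```

On this toy data, "animals" shares a page with "zoo" more often than any other term, while nothing from the stock market page is related to "zoo" at all. A real index would have millions of pages and would weight terms rather than just count them.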
Some forms of LSA are too computationally expensive. For example, Google isn’t smart enough to ‘learn’ the way some of the newer learning computers at MIT do (see some news reports on this). It cannot, for example, learn through its index that zebras and tigers are both examples of striped animals, although it may realize that stripes and zebra are more semantically connected than ducks and stripes.
Theming
Theming is more of an SEO-concocted subject that gets floated around often: choosing a ‘themed’ page for a link rather than a non-themed page. Basically, theming is what Google bought the company Kaltix for. Kaltix created the site-themed (flavored) search for Google, which is able to categorize many websites into varying themes, based on their content, links, etc., through a categorization structure.
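As a rough illustration of what "categorizing a page into a theme" could look like, here is a toy Python sketch that picks a theme by keyword overlap. This is my own simplification, not Kaltix's or Google's method; the theme names, the keyword sets, and the guess_theme function are all hypothetical, and it only looks at content, not links.

```python
# Hypothetical theme vocabulary; a real system would derive this from data.
THEMES = {
    "animals": {"zoo", "wildlife", "zebra", "tiger", "animals"},
    "finance": {"stock", "market", "trading", "bank", "loan"},
}


def guess_theme(text, themes=THEMES):
    """Return the theme whose keyword set overlaps the text most, or None."""
    words = set(text.lower().split())
    best_theme, best_overlap = None, 0
    for theme, keywords in themes.items():
        overlap = len(words & keywords)
        # Only a strictly larger overlap replaces the current best,
        # so a page with no matching keywords gets no theme at all.
        if overlap > best_overlap:
            best_theme, best_overlap = theme, overlap
    return best_theme


print(guess_theme("A zoo tour with wildlife and a zebra"))
print(guess_theme("stock market trading tips"))
```

The first page lands in the "animals" theme and the second in "finance". A page with none of the keywords gets no theme, which is roughly what people mean when they call a page "non-themed".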