The University of Massachusetts, Amherst houses a project called the Center for Intelligent Information Retrieval. The project is focused on solving many of the IR problems in web search and document collection search. Some of the more fascinating studies they undertake include:
- CALO, an advanced artificial intelligence program that crawls, classifies and learns in order to categorize information.
- A news search program called TDT, designed to identify the first news story on a particular subject and then classify all later stories on the same issue.
- A program that extracts unstructured text (like the kind you might find on the web) and identifies the topic and category the text should fall into.
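
To make the TDT idea a bit more concrete, here is a toy sketch of first-story detection. It is not CIIR's code or method; it simply assumes that stories can be compared as bags of words using cosine similarity, and that a story which is not similar enough to any earlier first story starts a new topic. The function names and the `threshold` value of 0.3 are made up for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def detect_first_stories(stories, threshold=0.3):
    """Label each story, in arrival order, as a first story or a follow-up.

    Returns a list of (is_first, topic_id) pairs. The threshold is an
    arbitrary illustrative value, not a tuned one.
    """
    topics = []  # one representative vector (the first story) per topic
    labels = []
    for text in stories:
        vec = bag_of_words(text)
        best_id, best_sim = None, 0.0
        for topic_id, rep in enumerate(topics):
            sim = cosine(vec, rep)
            if sim > best_sim:
                best_id, best_sim = topic_id, sim
        if best_id is None or best_sim < threshold:
            # Nothing similar enough has been seen: this starts a new topic.
            topics.append(vec)
            labels.append((True, len(topics) - 1))
        else:
            # Similar to an earlier first story: file it under that topic.
            labels.append((False, best_id))
    return labels


stories = [
    "Earthquake strikes coastal city overnight",
    "Rescue teams search coastal city after earthquake",
    "Central bank raises interest rates",
]
labels = detect_first_stories(stories)
```

In this sketch the first and third stories are flagged as first stories, and the second is filed under the first one's topic. A real system would need far more than word overlap, which is part of why this is still an open research problem.
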
They also have research, downloads (including betas of some of their software) and a list of publications from people working on the projects. It’s a little academic, but certainly a fun way to get a glimpse into what problems still exist in IR and what the future might hold.