ABSTRACT
The diversity of a set of search results reflects how well that set covers the multiple interpretations of the query that produced it. Interest in search result diversity has increased as search engines struggle to return relevant information for ambiguous queries from an increasingly large data set. We develop a result diversification system that uses probabilistic latent semantic indexing to form clusters, which are then used to reorder search results and increase result list diversity scores. We show that adjusting the influence of rank in the reordering algorithms improves their performance by balancing the importance of rank against the importance of the generated clusters. Finally, by applying the result diversification system to the known clusters used in judging, we generate reordered lists that estimate upper bounds of the diversity scores.
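To make the idea of trading off rank against clusters concrete, here is a minimal sketch of one possible cluster-based reordering. It is not the algorithm from the paper: the function name `diversify`, the `rank_weight` parameter, and the scoring formula are all assumptions made for illustration, and the cluster assignments are taken as given rather than computed with probabilistic latent semantic indexing.

```python
from collections import Counter


def diversify(results, clusters, rank_weight=0.5):
    """Greedily reorder a best-first result list using cluster labels.

    Hypothetical illustration only, not the paper's method.
    results:     list of document ids, ordered by original rank.
    clusters:    dict mapping each document id to a cluster label.
    rank_weight: 1.0 keeps the original order; 0.0 ignores rank and
                 only prefers clusters that have been picked least.
    """
    remaining = list(enumerate(results))  # (original position, doc id)
    picked_per_cluster = Counter()
    reordered = []

    def score(item):
        position, doc = item
        # Higher original rank gives a higher rank score.
        rank_score = 1.0 / (position + 1)
        # Documents from clusters already shown are less novel.
        novelty = 1.0 / (1 + picked_per_cluster[clusters[doc]])
        return rank_weight * rank_score + (1 - rank_weight) * novelty

    while remaining:
        # max() returns the first maximal item, so ties go to the
        # document with the better original rank.
        best = max(remaining, key=score)
        remaining.remove(best)
        picked_per_cluster[clusters[best[1]]] += 1
        reordered.append(best[1])
    return reordered


results = ["a", "b", "c", "d"]
clusters = {"a": 0, "b": 0, "c": 1, "d": 1}
print(diversify(results, clusters, rank_weight=1.0))
print(diversify(results, clusters, rank_weight=0.0))
```

With `rank_weight=1.0` the original order is preserved, while lower values pull documents from unseen clusters toward the top. Under these assumptions, tuning that single parameter is one simple way to look for the balance between rank and clusters that the abstract describes.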
Written for Advanced Information Retrieval, taught by Maarten de Rijke.
Download the full paper [PDF].
Tuesday, June 8, 2010