A Krylov subspace method for information retrieval
2004 (English)In: SIAM Journal on Matrix Analysis and Applications, ISSN 0895-4798, E-ISSN 1095-7162, Vol. 26, no 2, 566-582 p.Article in journal (Refereed) Published
A new algorithm for information retrieval is described. It is a vector space method with automatic query expansion. The original user query is projected onto a Krylov subspace generated by the query and the term-document matrix. Each dimension of the Krylov space is generated by a simple vector space search, using first the user query and then new queries generated by the algorithm and orthogonal to the previous query vectors. The new algorithm is closely related to latent semantic indexing (LSI), but it is a local algorithm that works on a new subspace of very low dimension for each query. This makes it faster and more flexible than LSI. No preliminary computation of the singular value decomposition (SVD) is needed, and changes in the data base cause no complication. Numerical tests on both small (Cranfield) and larger (Financial Times data from the TREC collection) data sets are reported. The new algorithm gives better precision at given recall levels than simple vector space and LSI in those cases that have been compared.
Place, publisher, year, edition, pages
2004. Vol. 26, no 2, 566-582 p.
information retrieval, vector space model, query expansion, latent semantic indexing, singular value decomposition, Lanczos algorithm, Krylov subspace
Computer and Information Science
IdentifiersURN: urn:nbn:se:kth:diva-47393DOI: 10.1137/S0895479803392261ISI: 000227761100016ScopusID: 2-s2.0-17444375626OAI: oai:DiVA.org:kth-47393DiVA: diva2:455164
QC 201111092011-11-092011-11-082011-11-09Bibliographically approved