In this paper we present an application of latent semantic analysis (LSA) to the indexing and retrieval of document images containing text. The query is specified as a set of word images, and the documents that best match the query representation in the latent semantic space are retrieved. We show through extensive experiments on a large database that applying LSA to document images improves retrieval precision, as it does for electronic text documents.
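The retrieval step described above can be illustrated with a minimal sketch. It assumes that each word image has already been mapped to a discrete term index (for example, by clustering word-image features); that mapping, the toy count matrix, and all function names here are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def lsa_fit(term_doc, k):
    """Truncated SVD of a term-by-document count matrix (terms are rows)."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep the k largest singular triplets; docs_k has one row per document.
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in(query_vec, U_k, s_k):
    """Project a query term-count vector into the k-dimensional latent space."""
    return (query_vec @ U_k) / s_k

def rank_documents(query_vec, U_k, s_k, docs_k):
    """Rank documents by cosine similarity to the query in the latent space."""
    q = fold_in(query_vec, U_k, s_k)
    denom = np.linalg.norm(docs_k, axis=1) * np.linalg.norm(q) + 1e-12
    sims = (docs_k @ q) / denom
    return np.argsort(-sims), sims

# Hypothetical toy data: 4 word-image terms (rows) x 4 document images (columns).
# Documents 0 and 1 share terms 0 and 1; documents 2 and 3 share terms 2 and 3.
term_doc = np.array([
    [3.0, 2.0, 0.0, 0.0],
    [2.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
])

U_k, s_k, docs_k = lsa_fit(term_doc, k=2)

# Query made of a single word image that was assigned to term 0.
query = np.array([1.0, 0.0, 0.0, 0.0])
order, sims = rank_documents(query, U_k, s_k, docs_k)
```

In this toy setup the two documents that share vocabulary with the query rank ahead of the other two. Cosine similarity in the reduced space is a common choice for LSA retrieval; the abstract does not state which similarity measure the authors use.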