Let D={d_1, d_2, ...d_D} be a given set of D string documents of total length n. Our task is to index D such that the k most relevant documents for an online query pattern P of length p can be retrieved efficiently. There exist linear space data structures of O(n) words for answering such queries in optimal O(p+k) time. In this paper, we describe a compact index of size |CSA|+n lgD+o(n lgD) bits with near optimal time, O(p+k lg? n), for the basic relevance metric term-frequency, where |CSA| is the size (in bits) of a compressed full-text index of D, and lg? n is the iterated logarithm of n.
展开▼