Given the user's input as a query, auto-completion selects the top-k strings with the highest scores from the strings matching the query in a dictionary. A recent study [14] proposed space-efficient data structures for top-k prefix matching for auto-completion. In practical applications, however, top-k substring matching is required for many purposes. In this paper, we present a novel approach to solve the top-k substring matching problem. We combined two trie-based data structures derived from the same dictionary for prefix and key search, and we search them alternately leveraging the implicit tree structure shared by them. Experimental results show that our algorithm can suggest top-k completion sufficiently fast, while taking much less space than a compressed full-text index of the dictionary.
展开▼