首页> 外文会议>International conference on very large data bases >Efficient Error-tolerant Query Autocompletion
【24h】

Efficient Error-tolerant Query Autocompletion

机译:高效的差错查询自动完成

获取原文

摘要

Query autocompletion is an important feature saving users many keystrokes from typing the entire query. In this paper we study the problem of query autocompletion that tolerates errors in users' input using edit distance constraints. Previous approaches index data strings in a trie, and continuously maintain all the prefixes of data strings whose edit distance from the query are within the threshold. The major inherent problem is that the number of such prefixes is huge for the first few characters of the query and is exponential in the alphabet size. This results in slow query response even if the entire query approximately matches only few prefixes. In this paper, we propose a novel neighborhood generation-based algorithm, IncNGTrie, which can achieve up to two orders of magnitude speedup over existing methods for the error-tolerant query autocompletion problem. Our proposed algorithm only maintains a small set of active nodes, thus saving both space and time to process the query. We also study efficient duplicate removal which is a core problem in fetching query answers. In addition, we propose optimization techniques to reduce our index size, as well as discussions on several extensions to our method. The efficiency of our method is demonstrated against existing methods through extensive experiments on real datasets.
机译:查询Autocompletion是一个重要的功能保存用户许多击键键入整个查询。在本文中,我们研究了使用编​​辑距离约束来容忍用户输入中的错误的查询自动完成问题。以前接近TRIE中的索引数据字符串,并连续地维护从查询的编辑距离的数据字符串的所有前缀在阈值范围内。主要固有问题是,此类前缀的数量对于查询的前几个字符具有巨大的巨大,并且在字母大小中是指数的。即使整个查询大致匹配少量的前缀,这也导致慢查询响应。在本文中,我们提出了一种新颖的基于邻域生成的算法,INGTRIE,可以实现最多两种数量级加速在现有方法中,用于容错查询自动完成问题的现有方法。我们所提出的算法仅维护一小一组有效节点,从而节省空间和时间来处理查询。我们还研究有效的重复删除,这是获取查询答案的核心问题。此外,我们提出了优化技术来减少我们的索引大小,以及对我们方法的几个扩展的讨论。我们的方法的效率是通过对实际数据集的大量实验来证明现有方法。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号