
Efficient fuzzy full-text type-ahead search


Abstract

Traditional information systems return answers after a user submits a complete query. Users often feel “left in the dark” when they have limited knowledge about the underlying data and have to use a try-and-see approach for finding information. A recent trend of supporting autocomplete in these systems is a first step toward solving this problem. In this paper, we study a new information-access paradigm, called “type-ahead search,” in which the system searches the underlying data “on the fly” as the user types in query keywords. It extends autocomplete interfaces by allowing keywords to appear at different places in the underlying data. This framework allows users to explore data as they type, even in the presence of minor errors. We study research challenges in this framework for large amounts of data. Since each keystroke of the user could invoke a query on the backend, we need efficient algorithms to process each query within milliseconds. We develop various incremental-search algorithms for both single-keyword queries and multi-keyword queries, using previously computed and cached results in order to achieve a high interactive speed. We develop novel techniques to support fuzzy search by allowing mismatches between query keywords and answers. We have deployed several real prototypes using these techniques. One of them has been deployed to support type-ahead search on the UC Irvine people directory, which has been used regularly and has been well received by users for its friendly interface and high efficiency.
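
The abstract names the ingredients (per-keystroke search, reuse of cached results, tolerance of minor errors) without giving the algorithms. The following is a minimal illustrative sketch of how these ideas can fit together for a single keyword, not the paper's implementation. It assumes a trie over the keywords and edit distance as the mismatch measure. All names (`Node`, `build_trie`, `initial_active`, `extend`, `completions`, `tau`) are hypothetical. The cached state is the set of trie nodes whose prefix lies within `tau` edits of what has been typed so far, and each keystroke updates that set from the previous one instead of searching from scratch.

```python
class Node:
    """Trie node; `word` is set when a keyword ends at this node."""
    def __init__(self):
        self.children = {}
        self.word = None


def build_trie(words):
    root = Node()
    for w in words:
        node = root
        for c in w:
            node = node.children.setdefault(c, Node())
        node.word = w
    return root


def _expand(active, tau):
    """Propagate 'extra trie character' edits: a child is at most one
    edit farther from the typed text than its parent."""
    stack = list(active.items())
    while stack:
        node, d = stack.pop()
        if active[node] != d or d + 1 > tau:
            continue  # stale entry, or no edit budget left
        for child in node.children.values():
            if d + 1 < active.get(child, tau + 1):
                active[child] = d + 1
                stack.append((child, d + 1))
    return active


def initial_active(root, tau):
    """Nodes within `tau` edits of the empty query."""
    return _expand({root: 0}, tau)


def extend(active, ch, tau):
    """Update the cached node set for one more typed character `ch`."""
    nxt = {}

    def relax(node, d):
        if d <= tau and d < nxt.get(node, tau + 1):
            nxt[node] = d

    for node, d in active.items():
        relax(node, d + 1)  # treat `ch` as an extra typed character
        for c, child in node.children.items():
            relax(child, d + (0 if c == ch else 1))  # match or substitution
    return _expand(nxt, tau)


def completions(active):
    """All keywords that have a prefix within `tau` edits of the query."""
    found = set()
    stack = list(active)
    while stack:
        node = stack.pop()
        if node.word is not None:
            found.add(node.word)
        stack.extend(node.children.values())
    return sorted(found)
```

In an interactive setting, the `active` dictionary would be kept between keystrokes: start from `initial_active(root, tau)`, call `extend` once per typed character, and call `completions` to list the current answers. Multi-keyword queries and ranking are outside the scope of this sketch.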

Bibliographic details

  • Source
    The VLDB Journal | 2011, Issue 4 | pp. 617-640 | 24 pages
  • Indexed in: Science Citation Index (SCI), USA
  • Original format: PDF
  • Language: English
