Engineering Basic Algorithms of an In-Memory Text Search Engine

FREDERIK TRANSIER; PETER SANDERS

Abstract

Inverted index data structures are the key to fast text search engines. We first investigate one of the predominant operations on inverted indexes, which asks for the intersection of two sorted lists of document IDs of different lengths. We explore the compression and performance of different inverted list data structures. In particular, we present Lookup, a new data structure that allows intersection in expected time linear in the length of the smaller list. Based on this result, we present the algorithmic core of a full-text database that allows fast Boolean queries, phrase queries, and document reporting using less space than the input text. The system uses a carefully choreographed combination of classical data compression techniques and inverted-index-based search data structures. Our experiments show that inverted indexes are preferable to purely suffix-array-based techniques for in-memory (English) text search engines. A similar system is now running in practice in each core of the distributed database engine TREX of SAP.
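
The intersection idea in the abstract can be illustrated with a minimal sketch. It assumes a bucketed layout: the long list of document IDs is sorted, and a top-level table indexed by the high bits of an ID points to the start of each bucket. The names `BucketedList`, `bucket_bits`, and `intersect` are hypothetical, and this is not the authors' implementation of Lookup. If IDs are spread roughly uniformly (for example, assigned at random) and the bucket width is chosen so that each bucket holds a constant number of IDs on average, each probe scans a constant expected number of entries, so the total work is expected to be linear in the smaller list.

```python
class BucketedList:
    """Sorted doc-ID list plus a table indexed by the high bits of an ID.

    Illustrative sketch only; assumes non-negative integer document IDs.
    """

    def __init__(self, doc_ids, bucket_bits=3):
        self.ids = sorted(set(doc_ids))
        self.bucket_bits = bucket_bits
        max_id = self.ids[-1] if self.ids else 0
        num_buckets = (max_id >> bucket_bits) + 1
        # starts[b] is the position of the first ID whose bucket is >= b;
        # the final entry is a sentinel equal to len(self.ids).
        self.starts = [0] * (num_buckets + 1)
        pos = 0
        for b in range(num_buckets + 1):
            while pos < len(self.ids) and (self.ids[pos] >> bucket_bits) < b:
                pos += 1
            self.starts[b] = pos

    def contains(self, doc_id):
        """Jump to the bucket of doc_id and scan only that bucket."""
        if doc_id < 0:
            return False
        b = doc_id >> self.bucket_bits
        if b + 1 >= len(self.starts):
            return False  # doc_id lies beyond the last bucket
        for i in range(self.starts[b], self.starts[b + 1]):
            if self.ids[i] == doc_id:
                return True
            if self.ids[i] > doc_id:
                return False  # bucket is sorted, so we can stop early
        return False


def intersect(short_ids, long_list):
    """Probe the long list once per element of the short list."""
    return [d for d in short_ids if long_list.contains(d)]


long_list = BucketedList(range(0, 100, 3))  # 0, 3, 6, ..., 99
result = intersect([3, 10, 27, 50, 99, 200], long_list)
print(result)  # [3, 27, 99]
```

The design choice worth noting is that the short list drives the loop: the long list is never scanned in full, only the buckets that the short list's IDs fall into.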

Bibliographic details

Source
ACM Transactions on Information Systems, 2011, Issue 1, pp. 2.1-2.37 (37 pages)
Authors
FREDERIK TRANSIER; PETER SANDERS
Author affiliations

Dietmar-Hopp-Allee 16, 69190 Walldorf, Germany;

Universitaet Karlsruhe, Fakultaet fuer Informatik, ITI Sanders, 76128 Karlsruhe, Germany;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
inverted index; in-memory search engine; randomization;
