Conference: IEEE/ACM International Conference on Big Data Computing, Applications and Technologies
A Data Indexing Technique to Improve the Search Latency of AND Queries for Large Scale Textual Documents

Abstract

Boolean AND queries (BAQ) are among the most important query types in text search. This paper proposes a graph-based indexing technique to reduce the search latency of BAQ. It shows how a graph structure represented using a hash table can reduce the number of intersections needed to execute a BAQ. The performance of the proposed technique is compared with that of the Inverted Index, one of the most widely used index structures for textual documents. A detailed performance analysis is carried out through prototyping and measurement on a system subjected to a synthetic workload. For further performance insight, the proposed technique is also compared with Elasticsearch, an enterprise-level search engine that uses an Inverted Index at its core. The analysis shows that the graph-based indexing technique can significantly reduce the latency of executing BAQ compared with the other techniques.
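The abstract does not describe the graph structure in detail, so the following Python sketch is only an illustration of the general idea, not the authors' design. It assumes a graph whose nodes are terms and whose edges are stored in a hash table keyed by a term pair, with each edge holding the IDs of documents containing both terms. All function names (`build_inverted_index`, `build_pair_index`, `and_query_inverted`, `and_query_graph`) and the sample documents are hypothetical. Under these assumptions, a k-term AND query needs k - 1 set intersections on a plain inverted index but only about half as many when whole edges are looked up.

```python
from collections import defaultdict
from itertools import combinations


def tokenize(text):
    """Lowercase whitespace tokenizer; returns the set of distinct terms."""
    return set(text.lower().split())


def build_inverted_index(docs):
    """Baseline: map each term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def build_pair_index(docs):
    """Assumed graph index (not the paper's): a hash table keyed by a sorted
    term pair (an edge), holding the documents that contain both terms."""
    edges = defaultdict(set)
    for doc_id, text in docs.items():
        for a, b in combinations(sorted(tokenize(text)), 2):
            edges[(a, b)].add(doc_id)
    return edges


def and_query_inverted(index, terms):
    """AND query on the inverted index; returns (doc IDs, intersection count)."""
    terms = sorted(set(terms), key=lambda t: len(index.get(t, ())))
    if not terms:
        return set(), 0
    result = set(index.get(terms[0], ()))
    intersections = 0
    for term in terms[1:]:
        result &= index.get(term, set())
        intersections += 1
    return result, intersections


def and_query_graph(edges, index, terms):
    """AND query using edge lookups; returns (doc IDs, intersection count)."""
    terms = sorted(set(terms))
    if not terms:
        return set(), 0
    if len(terms) == 1:
        return set(index.get(terms[0], ())), 0
    # Cover the query terms with edges: (t0, t1), (t2, t3), ...
    pairs = [(terms[i], terms[i + 1]) for i in range(0, len(terms) - 1, 2)]
    if len(terms) % 2:
        # Odd number of terms: cover the last term with an overlapping edge.
        pairs.append((terms[-2], terms[-1]))
    result = set(edges.get(pairs[0], ()))
    intersections = 0
    for pair in pairs[1:]:
        result &= edges.get(pair, set())
        intersections += 1
    return result, intersections


# Hypothetical sample corpus.
docs = {
    1: "graph index hash table",
    2: "inverted index hash",
    3: "graph hash table query",
}
index = build_inverted_index(docs)
edges = build_pair_index(docs)

print(and_query_inverted(index, ["graph", "hash", "table"]))
print(and_query_graph(edges, index, ["graph", "hash", "table"]))
```

Both functions return the same documents for the same query; only the number of intersections differs. The trade-off in this sketch is space: the pair table grows quadratically with the number of distinct terms per document, a cost the abstract does not discuss.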