Conference: IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing

A Graph-Based Indexing Technique to Enhance the Performance of Boolean AND Queries in Big Data Systems



Abstract

This paper introduces a new graph-based indexing (GBI) technique for big data systems. It uses a directed graph structure that captures the co-occurrence of multiple keywords in the same document. The objective is to use the relationships between search keywords captured in the graph to retrieve all results of a Boolean AND query at once. The performance of the proposed technique is compared with that of the conventional inverted-index-based technique. The paper reports that, irrespective of the intersection algorithm used to evaluate Boolean AND queries, GBI always returns Boolean AND search results faster than the inverted index, because GBI performs fewer intersection operations and avoids intersection altogether when the search keywords have no document in common. A preliminary performance analysis is carried out through prototyping and measurement on a system subjected to a synthetic workload. The analysis shows that GBI reduces search latency for Boolean AND queries by an average of 69% to 99.9% compared with the inverted index.
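The idea of storing keyword co-occurrence on graph edges can be illustrated with a minimal sketch. The abstract does not specify the paper's actual graph layout or query algorithm, so everything here is an assumption made for illustration: edges are directed from the lexicographically smaller keyword to the larger one, each edge stores the IDs of documents containing both keywords, and the function names (`build_inverted_index`, `build_graph_index`, `and_query_inverted`, `and_query_graph`) are hypothetical. Each query function also returns the number of set intersections it performed, so the two approaches can be compared.

```python
from collections import defaultdict
from itertools import combinations


def build_inverted_index(docs):
    """Map each keyword to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, words in docs.items():
        for word in set(words):
            index[word].add(doc_id)
    return dict(index)


def build_graph_index(docs):
    """Assumed layout: edge (a, b) with a < b holds the IDs of documents
    in which both keywords occur together."""
    graph = defaultdict(set)
    for doc_id, words in docs.items():
        for a, b in combinations(sorted(set(words)), 2):
            graph[(a, b)].add(doc_id)
    return dict(graph)


def and_query_inverted(index, keywords):
    """Baseline: intersect posting lists one keyword at a time."""
    terms = sorted(set(keywords))
    if not terms:
        return set(), 0
    result = set(index.get(terms[0], set()))
    intersections = 0
    for term in terms[1:]:
        result &= index.get(term, set())
        intersections += 1
    return result, intersections


def and_query_graph(graph, index, keywords):
    """Illustrative graph lookup: cover the keywords with edges, return
    early if any edge is missing, then intersect the edge document sets."""
    terms = sorted(set(keywords))
    if not terms:
        return set(), 0
    if len(terms) == 1:
        return set(index.get(terms[0], set())), 0
    # Pair keywords as (t0, t1), (t2, t3), ...; an odd leftover keyword
    # is covered by one extra edge that overlaps the previous pair.
    pairs = [(terms[i], terms[i + 1]) for i in range(0, len(terms) - 1, 2)]
    if len(terms) % 2 == 1:
        pairs.append((terms[-2], terms[-1]))
    # A missing edge means two keywords never share a document, so the
    # AND result is empty and no intersection is needed.
    if any(pair not in graph for pair in pairs):
        return set(), 0
    result = set(graph[pairs[0]])
    intersections = 0
    for pair in pairs[1:]:
        result &= graph[pair]
        intersections += 1
    return result, intersections
```

In this sketch a two-keyword query is answered by a single edge lookup with no intersection, and a query whose keywords never co-occur is rejected before any set operation runs. The cost is a larger index, since the number of edges grows with the number of keyword pairs per document.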
