首页> 外文会议>VLDB 2011;International conference on very large data bases >Efficient Parallel Lists Intersection and Index Compression Algorithms using Graphics Processing Units
【24h】

Efficient Parallel Lists Intersection and Index Compression Algorithms using Graphics Processing Units

机译:使用图形处理单元的高效并行列表相交和索引压缩算法

获取原文

摘要

Major web search engines answer thousands of queries per second requesting information about billions of web pages. The data sizes and query loads are growing at an exponential rate. To manage the heavy workload, we consider techniques for utilizing a Graphics Processing Unit (GPU). We investigate new approaches to improve two important operations of search engines - lists intersection and index compression.For lists intersection, we develop techniques for efficient implementation of the binary search algorithm for parallel computation. We inspect some representative real-world datasets and find that a sufficiently long inverted list has an overall linear rate of increase. Based on this observation, we propose Linear Regression and Hash Segmentation techniques for contracting the search range. For index compression, the traditional d-gap based compression schemata are not well-suited for parallel computation, so we propose a Linear Regression Compression schema which has an inherent parallel structure. We further discuss how to efficiently intersect the compressed lists on a GPU. Our experimental results show significant improvements in the query processing throughput on several datasets.
机译:主要的网络搜索引擎每秒回答数千个查询,以请求有关数十亿个网页的信息。数据大小和查询负载正以指数速度增长。为了管理繁重的工作量,我们考虑了利用图形处理单元(GPU)的技术。我们研究了改善搜索引擎两个重要操作的新方法-列表交集和索引压缩。对于列表交集,我们开发了有效实现二进制搜索算法以进行并行计算的技术。我们检查了一些代表性的现实世界数据集,发现足够长的反向列表具有总体线性增加率。基于此观察,我们提出了线性回归和哈希分割技术来缩小搜索范围。对于索引压缩,传统的基于d-gap的压缩方案不适用于并行计算,因此我们提出了一种具有固有并行结构的线性回归压缩方案。我们将进一步讨论如何有效地将GPU上的压缩列表相交。我们的实验结果表明,在多个数据集上的查询处理吞吐量有了显着提高。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号