Journal of Parallel and Distributed Computing

Accelerating the similarity self-join using the GPU

Abstract

The self-join finds all pairs of objects in a dataset that lie within a threshold distance of each other, as defined by a similarity metric. As such, the self-join is a fundamental building block in databases and data mining. In low dimensionality, there are several challenges associated with efficiently computing the self-join on the graphics processing unit (GPU). Low-dimensional data results in higher data densities, which cause a large number of distance calculations and a large result set; as dimensionality increases, index searches become increasingly exhaustive. We propose several techniques to optimize the self-join on the GPU: a GPU-efficient index that employs a bounded search, a batching scheme to accommodate large result sets, and low-overhead removal of duplicate searches. Furthermore, we propose a performance model that reveals bottlenecks related to the result set size and enables us to choose a batch size that mitigates two sources of performance degradation. Our approach outperforms the state of the art in most scenarios.
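To make the operation concrete, here is a minimal CPU sketch in Python of a distance-threshold self-join over a uniform grid. It is not the authors' GPU implementation: the function name `grid_self_join`, the use of Euclidean distance, and the choice of a grid cell side equal to the threshold `eps` are all assumptions made for illustration. With that cell size, any two points within `eps` of each other fall in the same or adjacent cells, so the search for each cell is bounded to its 3^d neighbors, and each unordered pair is reported once.

```python
from collections import defaultdict
from itertools import product
import math


def grid_self_join(points, eps):
    """Return all index pairs (i, j), i < j, with Euclidean distance <= eps.

    Illustrative sketch only: assumes a non-empty list of equal-length
    coordinate tuples and eps > 0.
    """
    dims = len(points[0])

    # Bin each point into a grid cell whose side length is eps (assumption).
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        cell = tuple(math.floor(c / eps) for c in p)
        grid[cell].append(idx)

    # A cell plus its adjacent cells: 3^dims candidate cells per lookup.
    offsets = list(product((-1, 0, 1), repeat=dims))

    pairs = []
    for cell, members in grid.items():
        for off in offsets:
            neighbor = tuple(c + o for c, o in zip(cell, off))
            if neighbor not in grid:
                continue  # skip empty cells without creating them
            for i in members:
                for j in grid[neighbor]:
                    # i < j keeps each unordered pair exactly once.
                    if i < j and math.dist(points[i], points[j]) <= eps:
                        pairs.append((i, j))
    return sorted(pairs)


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.5, 0.0), (3.0, 3.0), (3.2, 3.1), (10.0, 10.0)]
    print(grid_self_join(pts, 1.0))  # [(0, 1), (2, 3)]
```

The number of neighbor cells grows as 3^d, which is one way to see why index searches become more exhaustive as dimensionality increases. In dense low-dimensional data, each cell holds many points, so the inner loops perform many distance calculations and produce a large result set.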