...
首页> 外文期刊>Concurrency and computation: practice and experience >Fast parallel skew and prefix-doubling suffix array construction on the GPU
【24h】

Fast parallel skew and prefix-doubling suffix array construction on the GPU

机译:在GPU上快速并行倾斜和前缀加倍的后缀数组构造

获取原文
获取原文并翻译 | 示例
   

获取外文期刊封面封底 >>

       

摘要

Suffix arrays are fundamental full-text index data structures of importance to a broad spectrum of applications in such fields as bioinformatics, Burrows–Wheeler transform-based lossless data compression, and information retrieval. In this work, we propose and implement two massively parallel approaches on the graphics processing unit (GPU) based on two classes of suffix array construction algorithms. The first, parallel skew, makes algorithmic improvements to the previous work of Deo and Keely to achieve a speedup of 1.45x over their work. The second, a hybrid skew and prefix-doubling implementation, is the first of its kind on the GPU and achieves a speedup of 2.3–4.4x over Osipov's prefix-doubling and 2.4–7.9x over our skew implementation on large datasets. Our implementations rely on two efficient parallel primitives, a merge and a segmented sort. We theoretically analyze the two formulations of suffix array construction algorithms and show performance comparisons on a large variety of practical inputs. We conclude that, with the novel use of our efficient segmented sort, prefix-doubling is more competitive than skew on the GPU. We also demonstrate the effectiveness of our methods in our implementations of the Burrows-Wheeler transform and in a parallel full-text, minute-space-index for pattern searching. Copyright © 2016 John Wiley & Sons, Ltd.
机译:后缀数组是基本的全文本索引数据结构,对于生物信息学,基于Burrows-Wheeler变换的无损数据压缩和信息检索等领域的广泛应用具有重要意义。在这项工作中,我们基于两类后缀数组构造算法在图形处理单元(GPU)上提出并实现了两种大规模并行方法。第一个平行偏斜对Deo和Keely的先前工作进行了算法改进,以使其工作速度提高1.45倍。第二种是混合偏斜和前缀加倍实现,是GPU上的同类产品中的第一个,比Osipov的前缀加倍提高2.3-4.4倍,比大数据集上偏斜实现提高2.4-7.9倍。我们的实现依赖于两个有效的并行原语,即合并和分段排序。我们从理论上分析了后缀数组构造算法的两种公式,并在各种实际输入上显示了性能比较。我们得出结论,通过新颖地使用我们有效的分段排序,前缀加倍比GPU上的偏移更具竞争力。我们还在Burrows-Wheeler转换的实现中以及在并行的全文本分钟空间索引用于模式搜索中展示了我们方法的有效性。版权所有©2016 John Wiley&Sons,Ltd.

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号