【24h】

On Longest Repeat Queries Using GPU

机译:使用GPU进行最长重复查询

获取原文

摘要

Repeat finding in strings has important applications in sub-fields such as computational biology. The challenge of finding the longest repeats covering particular string positions was recently proposed and solved by Ileri et al., using a total of the optimal O(n) time and space, where n is the string size. However, their solution can only find the leftmost longest repeat for each of the n string position. It is also not known how to parallelize their solution. In this paper, we propose a new solution for longest repeat finding, which although is theoretically suboptimal in time but is conceptually simpler and works faster and uses less memory space in practice than the optimal solution. Further, our solution can find all longest repeats of every string position, while still maintaining a faster processing speed and less memory space usage. Moreover, our solution is parallelizable in the shared memory architecture (SMA), enabling it to take advantage of the modern multi-processor computing platforms such as the general-purpose graphics processing units (GPU). We have implemented both the sequential and parallel versions of our solution. Experiments with both biological and non-biological data show that our sequential and parallel solutions are faster than the optimal solution by a factor of 2-3.5 and 6-14, respectively, and use less memory space.
机译:在字符串中重复查找在子领域(例如计算生物学)中具有重要的应用。 Ileri等人最近提出并解决了寻找覆盖特定弦位置的最长重复序列的挑战,并使用了最佳O(n)时间和空间的总和,其中n是弦长。但是,他们的解决方案只能找到n个字符串位置中每一个的最左最长的重复。还不知道如何并行化他们的解决方案。在本文中,我们提出了一种用于最长重复发现的新解决方案,该解决方案虽然在理论上在时间上不理想,但在概念上更简单,工作更快,并且在实践中使用的内存空间少于最佳解决方案。此外,我们的解决方案可以找到每个字符串位置的所有最长重复,同时仍保持更快的处理速度和更少的内存空间使用率。此外,我们的解决方案可在共享内存架构(SMA)中实现并行化,从而使其能够利用现代多处理器计算平台,例如通用图形处理单元(GPU)。我们已经实现了解决方案的顺序和并行版本。使用生物学和非生物学数据进行的实验均表明,我们的顺序和并行解决方案比最佳解决方案要快2到3.5倍和6到14倍,并且使用的存储空间更少。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号