Journal of Supercomputing

Enabling fast and energy-efficient FM-index exact matching using processing-near-memory


Abstract

Memory bandwidth and latency constitute a major performance bottleneck for many data-intensive applications. While high-locality access patterns take advantage of the deep cache hierarchies available in modern processors, unpredictable low-locality patterns cause a significant part of the execution time to be wasted waiting for data. One example of such memory-bound applications is the exact matching algorithm based on the FM-index, used in some well-known sequence alignment applications. Processing-Near-Memory (PNM) has been proposed as a strategy to overcome the memory wall problem: by placing computation close to data, it speeds up memory-bound workloads by reducing data movement. This paper presents a performance and energy evaluation of two classes of processor architectures executing the FM-index exact matching algorithm, taken as a reference algorithm for exact sequence alignment. One architecture class is processor-centric, based on complex cores and DDR3/4 SDRAM memory technology. The other is memory-centric, based on simple cores and ultra-high-bandwidth hybrid memory cube (HMC) 3D-stacked memory technology. The results show that the PNM solution improves performance by between 1.26x and 3.7x and reduces energy consumption per operation by between 21x and 40x. In addition, a synthetic benchmark, RANDOM, was developed that mimics the memory access pattern of the FM-index exact matching algorithm but with a user-configurable operational intensity. This benchmark allows us to extend the evaluation to the class of algorithms with similar memory behaviour running over a range of operational intensity values.
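For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sketch of FM-index exact matching by backward search. It is a textbook toy version, not the implementation evaluated in the paper: the function names `build_fm_index` and `count_exact_matches`, the `$` sentinel, the naive suffix sort, and the uncompressed occurrence table are all assumptions made for illustration. Each step of the search loop reads the occurrence table at positions that depend on the previous step, which is plausibly the kind of unpredictable, low-locality access the abstract refers to.

```python
def build_fm_index(text):
    """Build a toy FM-index: the C table and a full occurrence table."""
    text += "$"  # sentinel; assumed to sort before every other symbol
    # Naive suffix array by sorting suffixes; acceptable only for tiny inputs.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the symbol preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in suffix_array)
    alphabet = sorted(set(text))
    # c_table[ch]: number of symbols in the text strictly smaller than ch.
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)
    # occ[ch][i]: number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, symbol in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if symbol == ch else 0)
    return c_table, occ, len(bwt)


def count_exact_matches(pattern, c_table, occ, n):
    """Count occurrences of pattern using backward search."""
    lo, hi = 0, n  # half-open suffix-array interval [lo, hi)
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        # Two data-dependent lookups per pattern symbol.
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


c_table, occ, n = build_fm_index("ACGTACGTAC")
match_count = count_exact_matches("AC", c_table, occ, n)
```

Real aligners typically sample the occurrence table and build the suffix array with a linear-time method. The sketch keeps everything uncompressed so that the two lookups per pattern symbol are easy to see.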
