Journal: IEEE Transactions on Computers

Unified Designs for High Performance LDPC Decoding on GPGPU


Abstract

Modern GPGPUs enable massively parallel computing with a degree of programmability that can exploit the highly parallel nature of LDPC decoding. Previous works customized their GPGPU designs to the specific execution attributes of a particular LDPC decoding matrix, so supporting a different matrix requires either substantial rework of the existing program or a brand-new parallel design. This paper proposes two unified designs that achieve high performance for both regular and irregular LDPC decoding on a GPGPU. The first design introduces a node-based scheme with a versatile translation array mechanism that efficiently handles the complex data access patterns of different LDPC decoding matrices. The second design proposes an edge-based parallel paradigm that uses a more intuitive data layout. Because a Tanner graph has more edges than nodes, the edge-based design also offers higher computation parallelism when the number of concurrent codewords is limited. With the proposed unified designs, designers can achieve high-performance LDPC decoding without regard to the type of LDPC matrix. Experiments on a GTX 470 GPGPU demonstrate up to a 134.56x runtime improvement over designs on a high-end CPU, with a maximum throughput of 80.25 Mbps. Compared with previous customized designs, the proposed systematic designs reach better performance while removing the customization effort.
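To illustrate the edge-based idea, here is a minimal Python sketch, not the authors' implementation. The abstract does not describe the actual data structures, so everything here is an assumption made for illustration: the names `build_edges` and `check_to_var_min_sum`, the small example matrix `H`, and the choice of the min-sum check-node update. The sketch flattens a parity-check matrix into per-edge arrays and computes each edge's outgoing message independently, which is the kind of work item that could map to one GPU thread.

```python
# Hypothetical example matrix (a (7,4) Hamming parity-check matrix),
# chosen only to keep the illustration small.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]


def build_edges(H):
    """Flatten H into two parallel arrays, one entry per Tanner-graph edge."""
    edge_check, edge_var = [], []
    for c, row in enumerate(H):
        for v, bit in enumerate(row):
            if bit:
                edge_check.append(c)  # check node this edge touches
                edge_var.append(v)    # variable node this edge touches
    return edge_check, edge_var


def check_to_var_min_sum(edge_check, v2c):
    """One min-sum check-node update with one independent work item per edge.

    v2c[e] is the incoming variable-to-check message on edge e. The outer
    loop has no cross-iteration dependencies, so each iteration could run
    as a separate thread. The inner scan over all edges is deliberately
    naive (quadratic) to keep the sketch short.
    """
    out = []
    for e, c in enumerate(edge_check):
        sign, mag = 1.0, float("inf")
        for other, oc in enumerate(edge_check):
            if oc == c and other != e:  # other edges of the same check node
                if v2c[other] < 0:
                    sign = -sign
                mag = min(mag, abs(v2c[other]))
        out.append(sign * mag)
    return out


edge_check, edge_var = build_edges(H)
num_edges = len(edge_check)
num_nodes = len(H) + len(H[0])  # check nodes plus variable nodes
```

For this example matrix the graph has 12 edges and 10 nodes, so a one-thread-per-edge mapping exposes slightly more parallel work items per codeword than a one-thread-per-node mapping, consistent with the abstract's argument for the edge-based design.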
