Journal of Parallel and Distributed Computing

PMSS: A programmable memory system and scheduler for complex memory patterns

Abstract

The HPC industry demands more computing units on FPGAs to improve performance through task and data parallelism. FPGAs can deliver their best performance on certain kernels by customizing the hardware for the application. However, applications are becoming more complex, with multiple kernels and complex data arrangements, which generates overhead when scheduling and managing system resources. For this reason, all classes of multithreaded machines, from minicomputers to supercomputers, require an efficient hardware scheduler and memory manager that improve the effective bandwidth and latency of DRAM main memory. Such an architecture could be a very competitive choice for supercomputing systems that must meet the parallelism demands of HPC benchmarks. In this article, we propose a Programmable Memory System and Scheduler (PMSS), which provides high-speed access for complex data patterns to multithreaded architectures. The proposed PMSS is implemented and tested on a Xilinx ML505 evaluation FPGA board. Its performance is compared with that of a microprocessor-based system integrated with the Xilkernel operating system. Results show that the modified PMSS-based multi-accelerator system consumes 50% fewer hardware resources, uses 32% less on-chip power, and achieves approximately a 19x speedup compared to the MicroBlaze-based system.