Benefits of Adding Hardware Support for Broadcast and Reduce Operations in MPSoC Applications

YUANXI PENG; MANUEL SALDANA; CHRISTOPHER A. MADILL; XIAOFENG ZOU; PAUL CHOW


Abstract

MPI has been used as a parallel programming model for supercomputers and clusters and recently in Multi-Processor Systems-on-Chip (MPSoC). One component of MPI is collective communication and its performance is key for certain parallel applications to achieve good speedups. Previous work showed that, with synthetic communication-only benchmarks, communication improvements of up to 11.4-fold and 22-fold for broadcast and reduce operations, respectively, can be achieved by providing hardware support at the network level in a Network-on-Chip (NoC). However, these numbers do not provide a good estimation of the advantage for actual applications, as there are other factors that affect performance besides communications, such as computation. To this end, we extend our previous work by evaluating the impact of hardware support over a set of five parallel application kernels of varying computation-to-communication ratios. By introducing some useful computation to the performance evaluation, we obtain more representative results of the benefits of adding hardware support for broadcast and reduce operations. The experiments show that applications with lower computation-to-communication ratios benefit the most from hardware support as they highly depend on efficient collective communications to achieve better scalability. We also extend our work by doing more analysis on clock frequency, resource usage, power, and energy. The results show reasonable scalability for resource utilization and power in the network interfaces as the number of channels increases and that, even though more power is dissipated in the network interfaces due to the added hardware, the total energy used can still be less if the actual speedup is sufficient. The application kernels are executed in a 24-embedded-processor system distributed across four FPGAs.
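The two qualitative claims in the abstract (kernels with low computation-to-communication ratios gain the most, and total energy can drop despite higher power if the speedup is sufficient) can be illustrated with a small back-of-the-envelope model. This is only a sketch under stated assumptions, not the authors' evaluation method: it assumes an Amdahl-style split of run time into computation and communication with no overlap, and constant average power over the run. The function names and the 1.2x power ratio are hypothetical; the 11.4-fold figure is the communication-only broadcast improvement quoted above, used here as an upper bound.

```python
def overall_speedup(comp_to_comm_ratio, comm_speedup):
    """Application speedup when only communication is accelerated.

    Assumption (not from the paper): run time = computation + communication,
    with no overlap. Baseline communication time is normalized to 1, so
    computation time equals comp_to_comm_ratio.
    """
    baseline = comp_to_comm_ratio + 1.0
    accelerated = comp_to_comm_ratio + 1.0 / comm_speedup
    return baseline / accelerated


def energy_ratio(power_ratio, speedup):
    """Energy with hardware support divided by baseline energy.

    Assumption: energy = average power * run time, so the ratio is
    (P_hw / P_base) / speedup. A value below 1 means less total energy.
    """
    return power_ratio / speedup


if __name__ == "__main__":
    # 11.4x is the communication-only broadcast figure from the abstract.
    for ratio in (0.0, 1.0, 9.0):
        s = overall_speedup(ratio, 11.4)
        # 1.2 is a hypothetical power overhead for the added hardware.
        e = energy_ratio(1.2, s)
        print(f"comp/comm={ratio:4.1f}  speedup={s:5.2f}  energy ratio={e:4.2f}")
```

Under this model, energy is saved exactly when the application speedup exceeds the power ratio, which matches the condition described in the abstract.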


Bibliographic record

Source
ACM Transactions on Reconfigurable Technology and Systems, 2014, Issue 3, pp. 17.1-17.23 (23 pages)
Authors
YUANXI PENG; MANUEL SALDANA; CHRISTOPHER A. MADILL; XIAOFENG ZOU; PAUL CHOW
Author affiliations

College of Computer, National University of Defense Technology, Changsha, Hunan, P. R. China, 410073;

ArchES Computing Systems Corp., Toronto, ON, Canada;

ArchES Computing Systems Corp., Toronto, ON, Canada;

College of Computer, National University of Defense Technology, Changsha, Hunan, P. R. China, 410073;

Department of Electrical and Computer Engineering, University of Toronto, ON, Canada;

Original format: PDF
Text language: English
Keywords
MPI; FPGA; parallel computing; multiprocessor; network-on-chip;


