Load balancing, broadcast, and scatter primitives for efficient multicore applications


Abstract

Efficient parallel execution of scientific and transaction-oriented applications requires reducing communication and synchronization overheads by improving locality through explicit methods that capture the underlying access patterns. In this work, we propose low-cost hardware that supports load balancing and parallel broadcast/scatter macro-operations. We evaluate these primitives using a cycle-accurate SystemC virtual platform of a multicore System-on-Chip (SoC) that interconnects cycle-accurate processor models (Cortex-A9) and a memory hierarchy via a hypercube Network-on-Chip (NoC). Results from executing a typical parallel matrix multiplication benchmark on a small-range embedded multicore SoC indicate average execution time improvements of 25% for load balancing, 21% for the broadcast/scatter primitives, and 50% when both are used together. While load balancing relies only on remote shared-memory access principles, synthesis on Zedboard's Zynq 7020 FPGA indicates a very low area cost for the scatter operation compared to an industrial DMA-based scatter/gather solution.
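As a rough software analogue of the communication pattern in the benchmark described above, the following sketch scatters row blocks of one matrix across workers, broadcasts the second matrix to all of them, and gathers the partial products. It is an illustration under stated assumptions, not the paper's hardware primitives: the function names (scatter, local_matmul, parallel_matmul), the near-equal row split, and the use of a Python thread pool are all hypothetical choices made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter(rows, n_workers):
    """Split rows into n_workers contiguous chunks of near-equal size.

    Hypothetical helper: a static split, used here only to mimic what a
    scatter operation distributes to each core.
    """
    base, extra = divmod(len(rows), n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        chunks.append(rows[start:start + size])
        start += size
    return chunks


def local_matmul(a_chunk, b):
    """Multiply a block of rows of A by the full (broadcast) matrix B."""
    cols = list(zip(*b))  # columns of B
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a_chunk]


def parallel_matmul(a, b, n_workers=4):
    """Scatter rows of A, broadcast B, compute locally, gather in order."""
    chunks = scatter(a, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Every worker receives the same B (broadcast) and its own chunk.
        parts = pool.map(local_matmul, chunks, [b] * n_workers)
        return [row for part in parts for row in part]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(parallel_matmul(a, b, n_workers=2))
```

The static split in scatter gives each worker a fixed share of rows; a load-balancing variant would instead let idle workers pull further row blocks at run time, which this sketch does not attempt.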

Bibliographic details

Source: International Workshop on Intelligent Solutions in Embedded Systems, 2015, 6 pages
Authors: Grammatikakis Miltos D.; Papagrigoriou Antonis; Petrakis Polydoros; Harteros Kostas; Kornaros George
Original format: PDF
Chinese Library Classification: Artificial intelligence theory
Keywords: Arrays; Hardware; Instruction sets; Load management; Message systems; Monitoring; Multicore processing; NoC; broadcast; load balancing; multicore SoC; scatter; scientific applications; shared memory

Similar literature

1. Energy-Efficient Multi-Constraint Routing Algorithm With Load Balancing for Smart City Applications [J]. Internet of Things Journal, IEEE. 2016, No. 6
2. Efficient Load Balancing for Wide-Area Divide-and-Conquer Applications [J]. Rob V. van Nieuwpoort, Thilo Kielmann, Henri E. Bal. ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages. 2001, No. 7
3. Energy-efficient load balancing in wireless sensor network: An application of multinomial regression analysis [J]. Ruwaida M Zuhairy, Mohammed GH Al Zamil. International Journal of Distributed Sensor Networks. 2018, No. 3
4. Load balancing, broadcast, and scatter primitives for efficient multicore applications [C]. Grammatikakis Miltos D., Papagrigoriou Antonis, Petrakis Polydoros. Proceedings of 2015 12th International Workshop on Intelligent Solutions in Embedded Systems. 2015
5. Power Efficient Scheduling for Network Applications on Multicore Architecture [D]. Kuang, Jilong. 2011
6. A High Performance Load Balance Strategy for Real-Time Multicore Systems [O]. Keng-Mao Cho, Chun-Wei Tsai, Yi-Shiuan Chiu
7. An Efficient Load Balancing Technique in a Multicore Mobile System [O]. Jungseok Cho, Doosan Cho. 2015
8. Adaptive Load-Balancing Algorithms using Symmetric Broadcast Networks [R]. Das, Sajal K., Harvey, Daniel J., Biswas, Rupak. 2002
