Impact of Vectorization and Multithreading on Performance and Energy Consumption on Jetson Boards

Abstract

ARM processors are well known for their energy efficiency and are consequently widely used in embedded platforms. Like other processor architectures, they are built with different levels of parallelism, from Instruction Level Parallelism (out-of-order and superscalar capabilities) to Thread Level Parallelism (multicore), to increase their performance levels. These processors are now also targeting the HPC domain and will equip the Fujitsu Post-K supercomputer. Some ARM processors from the Cortex-A series, which equip smartphones and tablets, also provide Data Level Parallelism through SIMD units called NEON. These units are able to process 128 bits of data at a time, for example four 32-bit floating-point values. Taking advantage of these units requires code vectorization, which may be performed automatically by the compiler or explicitly by using NEON intrinsics. Exploiting all these levels of parallelism may lead to better performance but also to higher energy consumption. This is not an issue in the HPC domain, where application development is driven by finding the best performance. However, development for embedded applications is driven by finding the best trade-off between energy consumption and performance. In this paper, we propose to study the impact of vectorization and multithreading on both performance and energy consumption on some Nvidia Jetson boards. Results show that, depending on the algorithm and on its implementation, vectorization may bring a speedup similar to that of an OpenMP scalar implementation but with lower energy consumption. However, combining vectorization and multithreading may come close to both the best performance level and the lowest energy consumption, but not when the cores run at their maximum frequencies.
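
As an illustration of the two levels of parallelism named in the abstract, the following is a minimal sketch, not the authors' benchmark code. It adds two float arrays once with a scalar loop and once with explicit NEON intrinsics processing four 32-bit values per 128-bit register, with the vector loop shared among threads through OpenMP. The function names `add_scalar` and `add_simd_omp` and the choice of kernel are assumptions made for this example. The NEON path is compiled only when the compiler defines `__ARM_NEON`; elsewhere a plain loop is used so the sketch still builds.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Scalar reference: one 32-bit float per iteration (hypothetical helper). */
void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Explicitly vectorized and multithreaded variant (hypothetical helper).
 * The OpenMP pragma takes effect only when compiled with OpenMP enabled
 * (e.g. -fopenmp with GCC or Clang); otherwise the loop runs on one thread. */
void add_simd_omp(const float *a, const float *b, float *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
#if defined(__ARM_NEON)
    const long nvec = (long)(n / 4); /* number of full 128-bit blocks */

    #pragma omp parallel for
    for (long v = 0; v < nvec; ++v) {
        float32x4_t va = vld1q_f32(a + 4 * v);      /* load 4 floats */
        float32x4_t vb = vld1q_f32(b + 4 * v);
        vst1q_f32(out + 4 * v, vaddq_f32(va, vb));  /* add and store 4 floats */
    }

    /* Scalar tail for the last n % 4 elements. */
    for (size_t i = (size_t)nvec * 4; i < n; ++i)
        out[i] = a[i] + b[i];
#else
    /* Portable fallback when NEON is unavailable: multithreading only. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

Both functions produce the same result for the same inputs; only the amount of work done per instruction and per core differs, which is the dimension along which the paper compares execution time and energy consumption.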

Bibliographic record

Source
International Conference on High Performance Computing and Simulation | 2018 | 522p | 8 pages
Authors
Sylvain Jubertie; Emmanuel Melin; Naly Raliravaka; Emmanuel Bodèle; Pablo Escot Bocanegra
Original format: PDF
Chinese Library Classification: TP30-53
Keywords
Energy consumption; Program processors; Current measurement; Multicore processing; Power supplies; Neon
