Performance and energy impact of parallelization and vectorization techniques in modern microprocessors

Abstract

While Moore's law states that the number of transistors approximately doubles every 2 years, powering these transistors simultaneously is only possible as long as Dennard scaling continues. Unfortunately, voltage scaling has slowed down in recent years, and microprocessor designers have hit what is known as the "utilization wall" or the "dark silicon" effect. Vectorization, parallelization, specialization and heterogeneity are the key approaches to deal with this utilization wall. However, how software developers can maximize energy efficiency of these architectures remains an open question. This paper presents an energy evaluation of parallelization using both physical and logical cores (i.e., SMT/Hyper-Threading), vectorization (SSE, Advanced Vector Extensions and NEON) and dynamic core reconfiguration [Intel®'s Turbo Boost Technology (TBT)]. The evaluation spans microprocessors for embedded, laptop, desktop and server markets, since there is a convergence among them towards energy efficiency. The analyzed processors include Intel's Core™ i5 and i7 family and ARM®'s Cortex™ A9 and A15. Results show that software developers should prioritize vectorization over thread parallelism when possible, as it yields better energy efficiency, especially on the Intel platforms. Application scalability can be reduced drastically when using vectorization and threading simultaneously since vectorization increases pressure on the memory subsystem. Intel's TBT further improves energy efficiency by an additional 10-20% depending on the number of active threads.
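
The comparison the abstract describes rests on energy to solution, that is, average power multiplied by execution time. The sketch below is only an illustration of that arithmetic: the `Run` class, the `energy_saving` helper and every number in it are hypothetical placeholders, not measurements or methods taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One measured configuration (all values used below are hypothetical)."""
    name: str
    seconds: float    # wall-clock time to solution
    avg_watts: float  # average power drawn during the run

    @property
    def joules(self) -> float:
        # Energy to solution = average power x execution time.
        return self.avg_watts * self.seconds


def energy_saving(baseline: Run, candidate: Run) -> float:
    """Fraction of the baseline energy saved by the candidate (negative = worse)."""
    return 1.0 - candidate.joules / baseline.joules


# Placeholder numbers for illustration only; they are NOT results from the paper.
scalar = Run("scalar, 1 thread", seconds=10.0, avg_watts=20.0)
vectorized = Run("vectorized, 1 thread", seconds=4.0, avg_watts=22.0)
threaded = Run("scalar, 4 threads", seconds=3.0, avg_watts=40.0)

for run in (vectorized, threaded):
    saving = energy_saving(scalar, run)
    print(f"{run.name}: {run.joules:.0f} J, saving {saving:.0%} vs. scalar")
```

With these made-up inputs the threaded run finishes first but draws more power, so the faster configuration is not automatically the more energy-efficient one; real conclusions require measured time and power for each platform.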

Bibliographic details

  • Source
    Computing | 2014, Issue 12 | pp. 1179-1193 | 15 pages
  • Author affiliations

    Department of Computer and Information Science (IDI), NTNU, 7491 Trondheim, Norway;

    Department of Computer and Information Science (IDI), NTNU, 7491 Trondheim, Norway;

    High Performance Computing Section, IT Department, NTNU, 7491 Trondheim, Norway;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Original format: PDF
  • Language: English
  • Keywords

    Performance evaluation; Energy efficiency; Vectorization;
