Performance Characterization of the 64-bit x86 Architecture from Compiler Optimizations' Perspective


Abstract

Intel Extended Memory 64 Technology (EM64T) and the AMD 64-bit architecture (AMD64) are emerging 64-bit x86 architectures that are fully x86 compatible. Compared with the 32-bit x86 architecture, the 64-bit x86 architectures offer applications several new features. For instance, applications can address 64 bits of virtual memory space, perform operations on 64-bit-wide operands, access 16 general-purpose registers (GPRs) and 16 extended multi-media (XMM) registers, and use a register-based argument passing convention. In this paper, we investigate the performance impact of these new features from the standpoint of compiler optimizations. Our research compiler is based on the Intel Fortran/C++ production compiler, and our experiments are conducted on the SPEC2000 benchmark suite. Results show that with 64-bit-wide pointer and long data types, several SPEC2000 C benchmarks slow down by more than 20%, mainly because of the enlarged memory footprint. To evaluate the performance potential of 64-bit x86 architectures, we designed and implemented the LP32 code model, in which pointer and long are 32 bits wide. Our experiments demonstrate that, on average, the LP32 code model speeds up the SPEC2000 C benchmarks by 13.4%. For the register-based argument passing convention, our experiments show that the performance gain is less than 1%, because aggressive function inlining already removes most calls. Finally, we observe that using 16 GPRs and 16 XMM registers significantly outperforms using only 8 GPRs and 8 XMM registers. However, our results also show that using 12 GPRs and 12 XMM registers achieves performance competitive with using 16 GPRs and 16 XMM registers.
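
The memory-footprint effect mentioned in the abstract can be illustrated with a small sketch. The code below is not taken from the paper: the `struct_size` helper, the `DATA_MODELS` table, and the example `NODE` layout are hypothetical, and the layout rule (each scalar aligned to its own size) is a simplifying assumption. It only shows how a pointer-heavy structure shrinks when pointer and long are 32 bits wide instead of 64.

```python
# Sizes in bytes of a few C scalar types under two data models.
# "LP64" is the common 64-bit x86 model (long and pointers are 64 bits).
# "LP32" follows the abstract's description: pointer and long are 32 bits.
DATA_MODELS = {
    "LP64": {"char": 1, "int": 4, "long": 8, "pointer": 8},
    "LP32": {"char": 1, "int": 4, "long": 4, "pointer": 4},
}


def struct_size(field_types, model):
    """Estimate sizeof() for a C struct with the given field types.

    Assumption: every scalar is aligned to its own size, and the struct
    is padded to a multiple of its largest member alignment.
    """
    sizes = DATA_MODELS[model]
    offset = 0
    max_align = 1
    for type_name in field_types:
        size = sizes[type_name]
        align = size
        # Round the current offset up to the field's alignment.
        offset = (offset + align - 1) // align * align
        offset += size
        max_align = max(max_align, align)
    # Add tail padding so arrays of the struct stay aligned.
    return (offset + max_align - 1) // max_align * max_align


# Hypothetical node, roughly:
#   struct node { int key; long weight; struct node *left, *right; };
NODE = ["int", "long", "pointer", "pointer"]

lp64 = struct_size(NODE, "LP64")
lp32 = struct_size(NODE, "LP32")
print(f"LP64: {lp64} bytes, LP32: {lp32} bytes")  # LP64: 32 bytes, LP32: 16 bytes
```

In this illustrative layout, the same node occupies half as many bytes under the 32-bit pointer and long sizes, so more nodes fit in each cache line. The slowdown and speedup figures in the abstract come from the authors' SPEC2000 measurements, not from this sketch.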


Bibliographic details

Source: International Conference on Compiler Construction (CC 2006), held as part of the Joint European Conferences on Theory and Practice of Software (ETAPS 2006), March 30-31, 2006, Vienna, Austria. 2006, pp. 155-169 (15 pages)
Conference location: Vienna, Austria
Authors: Jack Liu; Youfeng Wu
Affiliation: Intel Corporation, 2200 Mission Blvd, Santa Clara, CA
Original format: PDF
Language: English
Chinese Library Classification: Computer software

