Conference: IEEE/ACM International Symposium on Code Generation and Optimization

Enhancing Atomic Instruction Emulation for Cross-ISA Dynamic Binary Translation

Abstract

Dynamic Binary Translation (DBT) is a key enabler for cross-ISA emulation, system virtualization, runtime instrumentation, and many other important applications. Among the critical requirements for DBT is providing equivalent semantics for atomic synchronization instructions such as Load-Link/Store-Conditional (LL/SC), found mostly in reduced instruction set computer (RISC) architectures, and Compare-and-Swap (CAS), found mostly in complex instruction set computer (CISC) architectures. However, state-of-the-art DBT tools often do not translate these atomic instructions fully correctly, in particular when translating RISC atomic instructions (LL/SC) to CISC atomic instructions (CAS), because of performance concerns. As a result, some translations are exposed to the well-known ABA problem, which can lead to wrong results or program crashes. In our experimental studies on QEMU, a state-of-the-art DBT tool, a multi-threaded lock-free stack implemented with the ARM instruction set (using LL/SC) and run on Intel x86 platforms (using CAS) often crashes within 2 seconds. Although attempts have been made to provide correct emulation of such atomic instructions, they either incur heavy execution overheads or require additional hardware support. In this paper, we propose several schemes to address these issues and implement them in QEMU to evaluate their performance overheads. The results show that all of the proposed schemes provide correct emulation and that the best one achieves minimum, maximum, and geometric-mean speedups of 1.25x, 3.21x, and 2.03x, respectively, over the best existing software-based scheme.
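
The ABA hazard mentioned in the abstract can be illustrated with a small deterministic model. The sketch below is not QEMU code and not one of the paper's schemes; it assumes a translator that turns the guest's LL into a plain load and the guest's SC into a CAS on the loaded value. The `Memory` class, its `version` counter, and all function names are hypothetical, chosen only for illustration.

```python
class Memory:
    """Toy single-word memory that counts stores, so an LL reservation can be checked."""

    def __init__(self, value):
        self.value = value
        self.version = 0  # incremented on every store, even if the value ends up unchanged

    def store(self, new):
        self.value = new
        self.version += 1

    def cas(self, expected, new):
        """Compare-and-swap: succeeds whenever the current value equals `expected`."""
        if self.value != expected:
            return False
        self.store(new)
        return True

    def load_link(self):
        """LL: return the value plus a reservation token (here, the store count)."""
        return self.value, self.version

    def store_conditional(self, reservation, new):
        """SC: succeeds only if no store has happened since the matching LL."""
        if self.version != reservation:
            return False
        self.store(new)
        return True


def interfere(mem):
    """Model another thread changing A -> B -> A between the guest's LL and SC."""
    mem.store("B")
    mem.store("A")


def sc_emulated_with_cas():
    """Assumed translation: LL becomes a plain load, SC becomes CAS on the loaded value."""
    mem = Memory("A")
    observed = mem.value
    interfere(mem)
    return mem.cas(observed, "C")


def sc_with_llsc_semantics():
    """Reference behaviour: SC fails because stores intervened after the LL."""
    mem = Memory("A")
    _, reservation = mem.load_link()
    interfere(mem)
    return mem.store_conditional(reservation, "C")


if __name__ == "__main__":
    print("CAS-based emulation succeeds:", sc_emulated_with_cas())      # True
    print("LL/SC semantics succeed:", sc_with_llsc_semantics())        # False
```

In this model the CAS-based path reports success even though the location was modified twice, because CAS compares only values, while the LL/SC path fails as the guest program expects. In a lock-free stack, that spurious success can install a stale pointer, which is consistent with the wrong results and crashes the abstract describes.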
