Next-Generation Intermediate Representations for Binary Code Analysis

Abstract

Many binary code analysis tools rely on an intermediate representation (IR) derived from binary code instead of working directly with machine instructions. In this paper, we first consider binary code analysis problems that benefit from an IR and compile a list of requirements that an IR suitable for solving them should meet. Generally speaking, a universal binary analysis platform requires two principal components. The first component is a retargetable instruction decoder that uses external specifications to describe target instruction sets. External specifications facilitate maintenance and make it possible to add support for new instruction sets quickly. We analyze some of the most popular instruction set architectures (ISAs), including those used in microcontrollers, and from that analysis compile a list of requirements for the retargetable decoder. We then survey existing multi-ISA decoders and propose our vision of a more generic approach, based on a multi-layer directed acyclic graph that describes the decoding process in universal terms. The second component of the analysis platform is the architecture-neutral IR itself. In this paper, we describe such IRs and propose Pivot 2, an IR that is low-level enough to be easily constructed from decoded machine instructions while remaining easy to analyze. The main features of Pivot 2 are explicit side effects, SSA variables, a simpler alternative to phi-functions, and an extensible set of elementary operations at its core. This IR also supports machines that have multiple memory address spaces. Finally, we propose a way to tie the decoder and the IR together so that they fit most binary code analysis tasks, by means of abstract interpretation on top of the IR.
The proposed scheme takes into account various aspects of target architectures that are overlooked in many other works, including pipeline specifics (handling of delay slots, hardware loop support, etc.), exception and interrupt management, and a generic address space model in which accesses may have arbitrary side effects due to memory-mapped devices or other non-trivial behavior of the memory system.
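
To make the two components concrete, the following is a minimal sketch, not the paper's decoder and not Pivot 2's actual syntax. It assumes a made-up 8-bit toy ISA whose encoding is held in a data table (standing in for an "external specification"), and lowers each decoded instruction into elementary operations over SSA-style names, with the flag update and the memory access written out as explicit operations. Every name here (`SPEC`, `Op`, `Lowerer`, the `"data"` address space, the `zf` flag) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical specification for a toy 8-bit ISA (an assumption for this
# sketch): (mask, match, mnemonic, operand fields as name -> (shift, width)).
SPEC = [
    (0xF0, 0x10, "add",   {"rd": (2, 2), "rs": (0, 2)}),
    (0xF0, 0x20, "load",  {"rd": (2, 2), "rs": (0, 2)}),
    (0xF0, 0x30, "store", {"rd": (2, 2), "rs": (0, 2)}),
]


def decode(byte):
    """Decode one byte using only the data in SPEC (no per-ISA code)."""
    for mask, match, mnemonic, fields in SPEC:
        if byte & mask == match:
            operands = {name: (byte >> shift) & ((1 << width) - 1)
                        for name, (shift, width) in fields.items()}
            return mnemonic, operands
    raise ValueError(f"undecodable byte {byte:#04x}")


@dataclass(frozen=True)
class Op:
    """One elementary IR operation; dest is the SSA name it defines, if any."""
    dest: Optional[str]
    opcode: str
    args: tuple


class Lowerer:
    """Turns decoded instructions into ops over versioned (SSA-style) names."""

    def __init__(self):
        self.version = {}

    def use(self, name):
        # Current version of a register or flag; version 0 is the initial value.
        return f"{name}_{self.version.get(name, 0)}"

    def define(self, name):
        # Each definition creates a fresh version, so no name is assigned twice.
        self.version[name] = self.version.get(name, 0) + 1
        return f"{name}_{self.version[name]}"

    def lower(self, mnemonic, operands):
        rd, rs = f"r{operands['rd']}", f"r{operands['rs']}"
        if mnemonic == "add":
            a, b = self.use(rd), self.use(rs)
            result = self.define(rd)
            # The zero-flag update is a separate, explicit operation.
            return [Op(result, "add", (a, b)),
                    Op(self.define("zf"), "is_zero", (result,))]
        if mnemonic == "load":
            addr = self.use(rs)
            # The address space is an explicit argument of the access.
            return [Op(self.define(rd), "load", ("data", addr))]
        if mnemonic == "store":
            return [Op(None, "store", ("data", self.use(rs), self.use(rd)))]
        raise ValueError(f"no lowering rule for {mnemonic}")
```

In this sketch, supporting another instruction means adding a row to `SPEC` and a lowering rule, while an analysis written against `Op` never needs to know the encoding. A real retargetable decoder would also have to handle variable-length encodings, delay slots, and the other architectural details the abstract lists, none of which are modeled here.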

Bibliographic record

  • Source
    Programming and Computer Software | 2019, No. 7 | pp. 424-437 | 14 pages
  • Author affiliations

    Russian Acad Sci Ivannikov Inst Syst Programming Ul Solzhenitsyna 25 Moscow 109004 Russia|Moscow MV Lomonosov State Univ Moscow 119991 Russia;

    Russian Acad Sci Ivannikov Inst Syst Programming Ul Solzhenitsyna 25 Moscow 109004 Russia;

    Russian Acad Sci Ivannikov Inst Syst Programming Ul Solzhenitsyna 25 Moscow 109004 Russia;

    Russian Acad Sci Ivannikov Inst Syst Programming Ul Solzhenitsyna 25 Moscow 109004 Russia|Moscow MV Lomonosov State Univ Moscow 119991 Russia;

    Russian Acad Sci Ivannikov Inst Syst Programming Ul Solzhenitsyna 25 Moscow 109004 Russia|Moscow MV Lomonosov State Univ Moscow 119991 Russia;

    Russian Acad Sci Ivannikov Inst Syst Programming Ul Solzhenitsyna 25 Moscow 109004 Russia;

  • Original format: PDF
  • Language of text: English
