International Conference on Malicious and Unwanted Software

First byte: Force-based clustering of filtered block N-grams to detect code reuse in malicious software


Abstract

Detecting code reuse in malicious software is complicated by the lack of source code. The same circumstance that makes code reuse detection in malicious software desirable, namely the limited availability of original source code, also makes that detection difficult. In this paper, we propose a method for detecting code reuse in software, specifically malicious software, that moves beyond the limitations of targeting variant detection (categorization of families). This method expands n-gram analysis to target basic blocks extracted from compiled code rather than entire text sections. It also targets individual relationships between basic blocks found in localized code reuse, while preserving the ability to detect variants and families of variants found with generalized code reuse. We demonstrate the limitations of similarity calculated without first disassembling the instructions and show that our First Byte normalization gives dramatic improvements in detection of code reuse. To visualize results, our method proposes force-based clustering as a way to rapidly detect relationships between compiled binaries without complex analysis. Our methods retain the previously demonstrated ability of n-gram analysis to detect variants, while adding the ability to detect code reuse in non-variant malware. We show that our proposed filtering method reduces the number of similarity calculations and highlights only meaningful relationships in our malware set.
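
The abstract does not define First Byte normalization or the filtering step, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that normalization means keeping just the first byte of each disassembled instruction, that similarity is Jaccard overlap of n-gram sets, and that filtering is a simple threshold. The function names, the threshold, and the sample x86 byte strings are all hypothetical.

```python
from itertools import combinations


def first_bytes(block):
    """Assumed normalization: keep only the first byte of each instruction."""
    return bytes(insn[0] for insn in block if insn)


def ngrams(data, n):
    """Return the set of all length-n byte windows in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_blocks(blocks, n=2, threshold=0.5):
    """Compare every pair of basic blocks and keep pairs above the threshold.

    blocks maps a block name to a list of instruction byte strings,
    as a disassembler would produce them (hypothetical input format).
    """
    grams = {name: ngrams(first_bytes(b), n) for name, b in blocks.items()}
    edges = []
    for x, y in combinations(sorted(grams), 2):
        score = jaccard(grams[x], grams[y])
        if score >= threshold:
            edges.append((x, y, score))
    return edges


# Illustrative blocks: "a" and "b" differ only in one immediate operand.
blocks = {
    "a": [b"\x55", b"\x8b\xec", b"\x83\xec\x10", b"\xc3"],
    "b": [b"\x55", b"\x8b\xec", b"\x83\xec\x20", b"\xc3"],
    "c": [b"\x90", b"\x90", b"\xcc"],
}
edges = similar_blocks(blocks)
```

In this toy input, blocks "a" and "b" share the same first-byte sequence even though their raw bytes differ in an operand, so only that pair survives the threshold. The resulting weighted edge list is the kind of input a force-directed graph layout could consume for visualization.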
