Published in: Information Security Technical Report

Getting to the root of the problem: A detailed comparison of kernel and user level data for dynamic malware analysis


Abstract

Dynamic malware analysis is fast gaining popularity over static analysis because it is not easily defeated by evasion tactics such as obfuscation and polymorphism. During dynamic analysis, it is common practice to capture the system calls a sample makes in order to better understand its behaviour. There are several techniques for capturing system calls, the most popular of which is a user-level hook. To study the effects of collecting system calls at different privilege levels and viewpoints, we collected data at a process-specific user level using a virtualised sandbox environment and at a system-wide kernel level using a custom-built kernel driver. We then tested the performance of several state-of-the-art machine learning classifiers on the data. Random Forest was the best-performing classifier, with an accuracy of 95.2% on the kernel driver data and 94.0% on the user-level data. Combining user-level and kernel-level data gave the best classification results, with an accuracy of 96.0% for Random Forest. This may seem intuitive but had not hitherto been demonstrated empirically. Additionally, we observed that machine learning algorithms trained on user-level data tended to use the anti-debug/anti-VM features in malware to distinguish it from benignware, whereas algorithms trained on data from our kernel driver seemed to use differences in the general behaviour of the system to make their predictions, which explains why the two complement each other so well. Our results show that capturing data at different privilege levels affects the classifier's ability to detect malware, with kernel-level data providing more utility than user-level data for malware classification. Despite this, there are more established user-level tools than kernel-level tools, suggesting that more research effort should be directed at the kernel level. In short, this paper provides the first objective, evidence-based comparison of user-level and kernel-level data for the purposes of malware classification. (C) 2019 The Authors. Published by Elsevier Ltd.
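
To make the idea of combining the two viewpoints concrete, here is a minimal sketch of one way a user-level trace and a kernel-level trace for the same sample could be turned into a single feature vector before being passed to a classifier. The abstract does not say how the authors built their features, so the count-based encoding, the function names, the call names, and the traces below are all assumptions chosen for illustration.

```python
from collections import Counter


def count_features(trace, vocabulary):
    """Return call counts for `trace`, in the fixed order given by `vocabulary`."""
    counts = Counter(trace)
    return [counts[name] for name in vocabulary]


def combined_features(user_trace, kernel_trace, user_vocab, kernel_vocab):
    """Concatenate the user-level and kernel-level count vectors.

    Keeping the two views in separate positions lets a classifier weigh
    the same call differently depending on where it was observed.
    """
    return (count_features(user_trace, user_vocab)
            + count_features(kernel_trace, kernel_vocab))


# Hypothetical Windows-style call names and traces, for illustration only.
USER_VOCAB = ["NtCreateFile", "NtWriteFile", "IsDebuggerPresent"]
KERNEL_VOCAB = ["NtCreateFile", "NtWriteFile", "NtOpenProcess"]

user_trace = ["IsDebuggerPresent", "NtCreateFile", "NtWriteFile", "NtWriteFile"]
kernel_trace = ["NtOpenProcess", "NtCreateFile", "NtCreateFile", "NtWriteFile"]

vector = combined_features(user_trace, kernel_trace, USER_VOCAB, KERNEL_VOCAB)
print(vector)  # [1, 2, 1, 2, 1, 1]
```

A vector like this, one per sample, could then be fed to any standard classifier such as a Random Forest. Whether raw counts are the right encoding is a design choice the abstract leaves open.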
