Conference: International Conference on Security and Cryptography

Behavior-based Malware Analysis using Profile Hidden Markov Models


Abstract

In malware analysis, static binary analysis is becoming increasingly difficult because of the code obfuscation and packing techniques used by malware authors. For this reason, behavior-based analysis techniques are used in large malware analysis systems. In these dynamic analysis systems, malware samples are executed and monitored in a controlled environment using tools such as CWSandbox (Willems et al., 2007). In previous work, a number of clustering and classification techniques from machine learning and data mining have been applied to behavior reports to classify malware into families and even to identify new malware families. We propose using the Profile Hidden Markov Model (PHMM) to classify malware files into families or groups based on their behavior on the host system. PHMMs have been used extensively in bioinformatics to search large databases for similar protein and DNA sequences. We find that this model helps overcome the hurdle posed by polymorphism, which is common in malware today. We show that the classification accuracy is high and comparable with state-of-the-art methods, even when very few training samples are used to build the models. The experiments were run first on a dataset with 24 families and later on a larger dataset with close to 400 different malware families. For the large dataset, a fast clustering method was used to group malware with similar behavior after scoring against the PHMM profile database. We present the challenges in the evaluation methods and metrics for clustering large numbers of malware files, and show the effectiveness of profile hidden Markov models for known malware families.
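The classification step described in the abstract (score a behavior sequence against one model per family, then pick the best-scoring family) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a plain discrete HMM with the forward algorithm rather than a full profile HMM with match, insert, and delete states, and the family names, event names, and probabilities below are invented for illustration only.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_likelihood(seq, model, floor=1e-6):
    """Return log P(seq | model) for a discrete HMM (forward algorithm).

    `model` holds start probabilities, a transition matrix, and one
    emission dictionary per state. Symbols missing from an emission
    dictionary get the small probability `floor` (an assumption).
    """
    start, trans, emit = model["start"], model["trans"], model["emit"]
    n = len(start)

    def log_emit(state, symbol):
        return math.log(emit[state].get(symbol, floor))

    # Initialisation: start in each state and emit the first symbol.
    alpha = [math.log(start[s]) + log_emit(s, seq[0]) for s in range(n)]
    # Recursion: sum over all predecessor states, in log space.
    for symbol in seq[1:]:
        alpha = [
            log_sum_exp([alpha[p] + math.log(trans[p][s]) for p in range(n)])
            + log_emit(s, symbol)
            for s in range(n)
        ]
    return log_sum_exp(alpha)

def classify(seq, family_models):
    """Score `seq` against every family model; return (best_family, scores)."""
    scores = {
        name: forward_log_likelihood(seq, model)
        for name, model in family_models.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical two-state models for two made-up families.
FAMILY_MODELS = {
    "dropper": {
        "start": [0.9, 0.1],
        "trans": [[0.6, 0.4], [0.3, 0.7]],
        "emit": [
            {"file_write": 0.8, "reg_read": 0.1, "net_send": 0.1},
            {"proc_create": 0.8, "file_write": 0.1, "net_send": 0.1},
        ],
    },
    "spy": {
        "start": [0.5, 0.5],
        "trans": [[0.5, 0.5], [0.5, 0.5]],
        "emit": [
            {"reg_read": 0.7, "net_send": 0.2, "file_write": 0.1},
            {"net_send": 0.7, "reg_read": 0.2, "proc_create": 0.1},
        ],
    },
}

if __name__ == "__main__":
    report = ["file_write", "proc_create", "file_write"]
    family, scores = classify(report, FAMILY_MODELS)
    print(family, scores)
```

A profile HMM would additionally allow insertions and deletions relative to a family's consensus sequence, which is the property the abstract relies on to tolerate polymorphic variation; the sketch above only shows the per-family scoring and arg-max selection.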
