Detection of malicious code by applying machine learning classifiers on static features: A state-of-the-art survey

Abstract

This research synthesizes a taxonomy for classifying methods that detect new malicious code by applying Machine Learning (ML) to static features extracted from executables. The taxonomy is then operationalized to classify research on this topic and to pinpoint critical open research issues in light of emerging threats. The article addresses various facets of the detection challenge, including file representation and feature selection methods, classification algorithms, weighting ensembles, the imbalance problem, active learning, and chronological evaluation. From the survey we conclude that a framework for detecting new malicious code in executable files can be designed to achieve very high accuracy while maintaining low false positives (i.e., misclassifying benign files as malicious). The framework should include training multiple classifiers on various types of features (mainly OpCode n-grams, byte n-grams, and Portable Executable features), applying a weighting algorithm to the classification results of the individual classifiers, and an active learning mechanism to maintain high detection accuracy. The training of classifiers should also address the imbalance problem by generating classifiers that perform accurately in a real-life situation, where the percentage of malicious files among all files is estimated to be approximately 10%.
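As a rough illustration of two ingredients named in the abstract, the sketch below counts byte n-grams in a file's raw bytes and combines the outputs of several classifiers with a weighted vote. It is a minimal sketch under stated assumptions, not the framework the survey describes: the function names `byte_ngrams` and `weighted_vote`, the default n-gram size, and the 0.5 threshold are all hypothetical, and feature selection, the imbalance problem, and active learning are not covered.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams in raw file content.

    Hypothetical helper; n=2 is an arbitrary default, not a value
    taken from the survey.
    """
    # Slicing a bytes object yields bytes, which is hashable,
    # so each n-gram can be used directly as a Counter key.
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def weighted_vote(scores, weights, threshold=0.5):
    """Combine per-classifier 'malicious' scores with a weighted average.

    scores  : one value in [0, 1] per classifier (assumed probabilities)
    weights : one non-negative weight per classifier (assumed to reflect
              each classifier's past performance)
    Returns True when the weighted average reaches the threshold.
    """
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and equal in length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = sum(s * w for s, w in zip(scores, weights)) / total
    return combined >= threshold


if __name__ == "__main__":
    sample = b"\x90\x90\x90\xcc"
    print(byte_ngrams(sample, 2).most_common(1))
    print(weighted_vote([0.9, 0.2, 0.8], [2.0, 1.0, 1.0]))
```

In a fuller pipeline the n-gram counts would be reduced by feature selection before training, and the weights would be derived from each classifier's measured performance rather than set by hand.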

Bibliographic details

  • Source
    Information Security Technical Report | 2009, Issue 1 | pp. 16-29 | 14 pages
  • Author affiliation

    Deutsche Telekom Laboratories at Ben-Gurion University, Ben-Gurion University, Be'er Sheva 84105, Israel;

  • Indexed in: Engineering Index (EI), USA
  • Original format: PDF
  • Language: English
