Published in: International Conference on Digital Information Management
Classification of computer viruses from binary code using ensemble classifier and recursive feature elimination

Abstract

This paper proposes a supervised machine learning model for detecting (unseen) virus files. Our main focus is on a static analysis approach. To find the best method, we experiment with different types of feature extraction and three classification algorithms: extreme gradient boosting, random forest, and multilayer perceptron. Our data set contains 6,319 executable files. Each file is processed with objdump, and the extracted features are ranked by TF-IDF score to select the best ones. The resulting F1 scores are slightly better than those of the baselines. Random forest with 20 attributes yields an F1 score of 0.9379758, which is 0.0316167 higher than that of the baseline. The extreme gradient boosting method with 500 attributes achieves an F1 score of 0.9628991, which is 0.0418642 higher than that of the baseline. We conclude that our approach can improve the precision and recall of the classification.
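The feature-ranking and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how TF-IDF scores are aggregated, so the mean TF-IDF across files is assumed here, and the opcode lists stand in for tokens that would be parsed from objdump output. The names `tfidf_rank`, `f1_score`, and `toy_docs` are hypothetical, and the classifier itself is omitted.

```python
import math
from collections import Counter


def tfidf_rank(docs, k):
    """Return the k tokens with the highest mean TF-IDF score across docs.

    Assumption: features are ranked by mean TF-IDF; the abstract does not
    state the exact aggregation.
    """
    n_docs = len(docs)
    # Document frequency: number of docs in which each token appears.
    df = Counter(tok for doc in docs for tok in set(doc))
    totals = Counter()
    for doc in docs:
        for tok, count in Counter(doc).items():
            tf = count / len(doc)
            # The +1 keeps tokens that occur in every doc from scoring zero.
            idf = math.log(n_docs / df[tok]) + 1.0
            totals[tok] += tf * idf
    # Sort by descending mean score, breaking ties alphabetically.
    ranked = sorted(totals, key=lambda t: (-totals[t] / n_docs, t))
    return ranked[:k]


def f1_score(y_true, y_pred):
    """F1 for binary labels, where 1 marks a virus file."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical opcode sequences, one list per executable file.
toy_docs = [
    ["push", "mov", "call", "xor"],
    ["mov", "mov", "add", "ret"],
    ["xor", "xor", "jmp", "call"],
]
top_features = tfidf_rank(toy_docs, k=2)

# Hypothetical labels and predictions, used only to exercise f1_score.
score = f1_score([1, 0, 1, 1], [1, 0, 0, 1])
```

In a fuller pipeline, the counts of the selected tokens would form the attribute vector (20 or 500 attributes in the abstract) passed to a classifier such as random forest or extreme gradient boosting.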
