Dealing with Class Noise in Large Training Datasets for Malware Detection


Abstract

This paper presents the approaches we have explored so far for detecting and dealing with the class noise found in the large annotated datasets used to train the classifiers that we previously designed for industrial-scale malware identification. First, we established a number of distance-based filtering rules that allow us to identify different "levels" of potential noise in the training data. Second, we analysed the effect that either removing or "cleaning" the potentially noisy records has on the performance of our simplest classifiers. We show that careful distance-based filtering can lead to noticeably better results in malware detection.


Bibliographic details

Source
Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 2011 13th International Symposium on | 2011 | pp. 401-407 | 7 pages
Conference location: Timisoara (RO)
Authors
Gavrilut Dragos; Ciortuz Liviu;
Original format: PDF
Language: English
Chinese Library Classification: Computational complexity theory
Keywords
class noise; data cleansing; malware detection;

