Combined oversampling and undersampling method based on slow-start algorithm for imbalanced network traffic

Abstract

Network traffic data typically consist of a large amount of normal traffic and a small amount of attack traffic. This class imbalance degrades prediction performance, for example through prediction bias on the minority data and misclassification of normal data as outliers. Representative sampling methods for addressing the imbalance include various oversampling-based models that synthesize minority data. However, because oversampling involves repeatedly learning the same data, the classification model can overfit the training data. Undersampling methods, in turn, can cause information loss because they remove data. To improve on these oversampling and undersampling approaches, we propose an oversampling ensemble method based on the slow-start algorithm. The proposed combined oversampling and undersampling method based on the slow-start algorithm (COUSS) is modeled on the congestion control algorithm of the Transmission Control Protocol (TCP). Accordingly, starting from a dataset to which minimal undersampling has been applied, the imbalanced dataset is oversampled until overfitting occurs. Simulation results obtained using the KDD99 dataset show that the proposed COUSS method improves the F1 score by 8.639%, 6.858%, 5.003%, and 4.074% compared to the synthetic minority oversampling technique (SMOTE), borderline-SMOTE, adaptive synthetic sampling, and generative adversarial network oversampling algorithms, respectively. The COUSS method can therefore be regarded as a practical solution for data analysis applications.
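
The abstract does not spell out the COUSS procedure, so the following is only a minimal sketch of the general idea it describes: apply light undersampling to the majority class, then grow the number of synthetic minority samples in a slow-start fashion (doubling the increment each round) and stop once a validation score no longer improves, which is used here as a stand-in for the onset of overfitting. Everything specific is an assumption made for illustration and is not taken from the paper: the function names, the `keep_ratio` and `initial_window` parameters, the simple pairwise interpolation used in place of a real SMOTE implementation, and the caller-supplied `evaluate` callback (for example, a function that trains a classifier and returns its validation F1 score).

```python
import random
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, ...]


def random_undersample(majority: Sequence[Sample], keep_ratio: float,
                       rng: random.Random) -> List[Sample]:
    """Keep a random fraction of the majority class (assumed: light undersampling)."""
    k = max(1, int(len(majority) * keep_ratio))
    return rng.sample(list(majority), k)


def interpolate_oversample(minority: Sequence[Sample], n_new: int,
                           rng: random.Random) -> List[Sample]:
    """Create n_new synthetic points by interpolating between random minority pairs.

    This is a simplified stand-in for SMOTE (no nearest-neighbour search).
    """
    synthetic = []
    for _ in range(n_new):
        a, b = rng.choice(minority), rng.choice(minority)
        t = rng.random()
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic


def slow_start_oversample(
    majority: Sequence[Sample],
    minority: Sequence[Sample],
    evaluate: Callable[[List[Sample], List[Sample]], float],
    keep_ratio: float = 0.9,      # hypothetical default
    initial_window: int = 1,      # hypothetical default
    max_rounds: int = 20,
    seed: int = 0,
) -> Tuple[List[Sample], List[Sample], float]:
    """Grow the minority class with a doubling window until the score stops improving."""
    rng = random.Random(seed)
    reduced_majority = random_undersample(majority, keep_ratio, rng)

    best_minority = list(minority)
    best_score = evaluate(reduced_majority, best_minority)
    window = initial_window

    for _ in range(max_rounds):
        # Never oversample past the point where the classes are balanced.
        room = len(reduced_majority) - len(best_minority)
        if room <= 0:
            break
        n_new = min(window, room)
        candidate = best_minority + interpolate_oversample(minority, n_new, rng)
        score = evaluate(reduced_majority, candidate)
        if score <= best_score:
            # No improvement: treated here as a proxy for overfitting, so stop
            # and keep the last minority set that did improve the score.
            break
        best_minority, best_score = candidate, score
        window *= 2  # slow-start style exponential growth of the increment
    return reduced_majority, best_minority, best_score
```

In practice `evaluate` would wrap model training and validation; a real TCP-style controller might also switch to smaller, linear increments after the first failed round rather than stopping outright, which this sketch omits for brevity.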