Journal: IEEE Transactions on Knowledge and Data Engineering
K-Means+ID3: A Novel Method for Supervised Anomaly Detection by Cascading K-Means Clustering and ID3 Decision Tree Learning Methods

Abstract

In this paper, we present "k-means+ID3", a method that cascades k-means clustering and ID3 decision tree learning to classify anomalous and normal activities in a computer network, an active electronic circuit, and a mechanical mass-beam system. The k-means clustering method first partitions the training instances into k clusters using Euclidean distance as the similarity measure. On each cluster, which represents a density region of normal or anomalous instances, we build an ID3 decision tree. The decision tree on each cluster refines the decision boundaries by learning the subgroups within the cluster. To obtain a final classification decision, the decisions of the k-means and ID3 methods are combined using two rules: 1) the nearest-neighbor rule and 2) the nearest-consensus rule. We perform experiments on three data sets: 1) network anomaly data (NAD), 2) Duffing equation data (DED), and 3) mechanical system data (MSD), which contain measurements from three distinct application domains: computer networks, an electronic circuit implementing a forced Duffing equation, and a mechanical system, respectively. Results show that the detection accuracy of the k-means+ID3 method is as high as 96.24 percent at a false-positive rate of 0.03 percent on NAD; the total accuracy is as high as 80.01 percent on MSD and 79.9 percent on DED.
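The cascade described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation, and several parts are assumptions made only for illustration: the helper names (`kmeans`, `binarize`, `id3`, `train`, `predict`), the discretization of each numeric feature as "above the cluster centroid or not" so that ID3 has discrete attributes to split on, and the final decision step, which simply uses the tree of the single nearest cluster. The abstract names the nearest-neighbor and nearest-consensus rules but does not define them, so they are not reproduced here.

```python
import math
import random
from collections import Counter

def dist2(a, b):
    """Squared Euclidean distance between two numeric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(p, centroids):
    """Index of the centroid closest to p."""
    return min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))

def kmeans(points, k, iters=20, seed=0):
    """Lloyd's algorithm with Euclidean distance; returns k centroids."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to its group's mean; keep it if the group is empty.
        centroids = [[sum(col) / len(g) for col in zip(*g)] if g else c
                     for g, c in zip(groups, centroids)]
    return centroids

def binarize(p, centroid):
    """ASSUMPTION (not from the paper): one boolean attribute per feature."""
    return tuple(x > c for x, c in zip(p, centroid))

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def id3(rows, labels, attrs):
    """Returns a label (leaf) or a node (attr, {value: subtree}, majority)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority

    def gain(a):
        # Information gain of splitting on attribute index a.
        remainder = 0.0
        for v in {r[a] for r in rows}:
            subset = [l for r, l in zip(rows, labels) if r[a] == v]
            remainder += len(subset) / len(labels) * entropy(subset)
        return entropy(labels) - remainder

    best = max(attrs, key=gain)
    rest = [a for a in attrs if a != best]
    branches = {}
    for v in {r[best] for r in rows}:
        keep = [i for i, r in enumerate(rows) if r[best] == v]
        branches[v] = id3([rows[i] for i in keep],
                          [labels[i] for i in keep], rest)
    return (best, branches, majority)

def classify(tree, row):
    """Walk the tree; fall back to the node's majority label on unseen values."""
    while isinstance(tree, tuple):
        attr, branches, majority = tree
        tree = branches.get(row[attr], majority)
    return tree

def train(points, labels, k):
    """Stage 1: k-means partition. Stage 2: one ID3 tree per cluster."""
    centroids = kmeans(points, k)
    fallback = Counter(labels).most_common(1)[0][0]
    trees = []
    for i, c in enumerate(centroids):
        members = [j for j, p in enumerate(points)
                   if nearest(p, centroids) == i]
        if not members:
            trees.append(fallback)  # empty cluster: use the global majority
            continue
        rows = [binarize(points[j], c) for j in members]
        trees.append(id3(rows, [labels[j] for j in members],
                         list(range(len(c)))))
    return centroids, trees

def predict(p, centroids, trees):
    """ASSUMPTION: decide with the tree of the single nearest cluster."""
    i = nearest(p, centroids)
    return classify(trees[i], binarize(p, centroids[i]))
```

Labels are expected to be strings such as `"normal"` and `"anomaly"`; after `centroids, trees = train(points, labels, k)`, a new instance is classified with `predict(p, centroids, trees)`. When a cluster mixes both classes, its tree separates the subgroups inside that cluster, which is the refinement role the abstract assigns to ID3.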