Method to indentify anomalous data using cascaded K-Means clustering and an ID3 decision tree

Abstract

The invention is a computer-implemented technique for identifying anomalous data in a data set. The method uses cascaded k-Means clustering and ID3 decision tree learning to characterize a training data set whose data points have known characterizations. The k-Means clustering method first partitions the training instances into k clusters using Euclidean distance similarity. On each training cluster, which represents a density region of normal or anomalous instances, the invention builds an ID3 decision tree. The decision tree on each cluster refines the decision boundaries by learning the sub-groups within the cluster. A test data point is then subjected to the clustering and decision trees constructed from the training instances. To obtain a final classification decision, the decisions of the k-Means and ID3 methods are combined using two rules: (1) the Nearest-neighbor rule, and (2) the Nearest-consensus rule.
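
The cascade described in the abstract can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract names the two combination rules without defining them, so the reading used here (Nearest-neighbor takes the ID3 decision of the closest cluster; Nearest-consensus takes the closest of the f nearest clusters whose k-Means and ID3 decisions agree) is an assumption. The function names, the rounding-based `binner`, the first-k-points centroid initialization, the parameter `f`, and the toy data are all hypothetical.

```python
import math
from collections import Counter

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    # Assumption: the first k points seed the centroids (deterministic).
    centroids = [list(p) for p in points[:k]]
    nearest = lambda p: min(range(k), key=lambda j: dist(p, centroids[j]))
    for _ in range(iters):
        assign = [nearest(p) for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, [nearest(p) for p in points]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def id3(rows, labels, attrs):
    """ID3 on categorical rows; leaves are label strings, nodes are dicts."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority
    def remainder(a):  # expected entropy after splitting on attribute a
        groups = {}
        for r, y in zip(rows, labels):
            groups.setdefault(r[a], []).append(y)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    best = min(attrs, key=remainder)  # max information gain
    node = {"attr": best, "default": majority, "kids": {}}
    rest = [a for a in attrs if a != best]
    for v in set(r[best] for r in rows):
        sub = [(r, y) for r, y in zip(rows, labels) if r[best] == v]
        node["kids"][v] = id3([r for r, _ in sub], [y for _, y in sub], rest)
    return node

def tree_predict(node, row):
    while isinstance(node, dict):
        node = node["kids"].get(row[node["attr"]], node["default"])
    return node

def train(points, labels, k, binner):
    """Cluster first, then fit one ID3 tree per non-empty cluster."""
    centroids, assign = kmeans(points, k)
    models = []
    for j in range(k):
        idx = [i for i, a in enumerate(assign) if a == j]
        if not idx:
            continue
        ys = [labels[i] for i in idx]
        rows = [binner(points[i]) for i in idx]
        majority = Counter(ys).most_common(1)[0][0]  # k-Means decision
        tree = id3(rows, ys, list(range(len(rows[0]))))
        models.append((centroids[j], majority, tree))
    return models

def classify(models, x, binner, rule="nearest-neighbor", f=2):
    # Consider the f clusters closest to x, nearest first.
    cands = sorted(models, key=lambda m: dist(x, m[0]))[:f]
    decisions = [(maj, tree_predict(tree, binner(x))) for _, maj, tree in cands]
    if rule == "nearest-consensus":
        for km_decision, id3_decision in decisions:
            if km_decision == id3_decision:
                return id3_decision
    # Nearest-neighbor rule (also the fallback when no consensus exists).
    return decisions[0][1]

# Hypothetical toy data: one mixed cluster near the origin, one anomalous cluster.
binner = lambda p: tuple(int(round(v)) for v in p)  # assumed discretization for ID3
points = [(0, 0), (10, 10), (0, 1), (1, 0), (1, 1), (3, 0), (3, 1), (10, 11), (11, 10)]
labels = ["normal", "anomaly", "normal", "normal", "normal",
          "anomaly", "anomaly", "anomaly", "anomaly"]

models = train(points, labels, k=2, binner=binner)
print(classify(models, (3.2, 0.4), binner))
print(classify(models, (0.2, 0.8), binner, rule="nearest-consensus"))
```

In the toy data, the cluster near the origin is mostly normal but contains a small anomalous sub-group, which is the case the per-cluster tree is meant to handle: the cluster's majority label alone would miss that sub-group, while the tree built on the cluster can separate it.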

Bibliographic details

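  • Publication number: US7792770B1

  • Publication date: 2010-09-07

  • Original format: PDF

  • Applicant/Patentee: VIR V. PHOHA; KIRAN S. BALAGANI

  • Application number: US20080072252

  • Inventors: VIR V. PHOHA; KIRAN S. BALAGANI

  • Filing date: 2008-02-25

  • Classification: G06N5

  • Country: US

  • Date indexed: 2022-08-21 18:48:23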
