Method to identify anomalous data using cascaded K-Means clustering and an ID3 decision tree
Abstract
The invention is a computer-implemented technique for identifying anomalous data in a data set. The method cascades k-Means clustering and ID3 decision tree learning to characterize a training data set whose data points have known characterization. The k-Means clustering method first partitions the training instances into k clusters using Euclidean distance similarity. On each training cluster, representing a density region of normal or anomaly instances, the invention builds an ID3 decision tree. The decision tree on each cluster refines the decision boundaries by learning the sub-groups within the cluster. A test data point is then subjected to the clustering and decision trees constructed from the training instances. To obtain a final decision on classification, the decisions of the k-Means and ID3 methods are combined using two rules: (1) the Nearest-neighbor rule, and (2) the Nearest-consensus rule.
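The cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not define the two combination rules, so the code assumes one plausible reading: the Nearest-neighbor rule returns the ID3 decision of the closest cluster, and the Nearest-consensus rule returns the decision of the closest of the f nearest clusters whose k-Means majority label and ID3 decision agree. The names `discretize`, `train`, and `classify`, the fixed-width binning, the parameter `f`, the fallback when no consensus exists, and the toy data are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(points, centroids):
    """Group point indices by their nearest centroid."""
    groups = [[] for _ in centroids]
    for i, p in enumerate(points):
        nearest = min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
        groups[nearest].append(i)
    return groups

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd iterations; an empty cluster keeps its old centroid."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = assign(points, centroids)
        centroids = [
            tuple(sum(points[i][d] for i in g) / len(g) for d in range(len(points[0])))
            if g else centroids[c]
            for c, g in enumerate(groups)
        ]
    return centroids, assign(points, centroids)

def discretize(p, width=1.0):
    """ASSUMPTION: fixed-width binning so that ID3 sees categorical values."""
    return tuple(math.floor(x / width) for x in p)

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def id3(rows, labels, attrs):
    """Return a leaf label, or (attribute, children, majority) for a split."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority
    def gain(a):
        remainder = 0.0
        for v in set(r[a] for r in rows):
            subset = [l for r, l in zip(rows, labels) if r[a] == v]
            remainder += len(subset) / len(labels) * entropy(subset)
        return entropy(labels) - remainder
    best = max(attrs, key=gain)  # attribute with the highest information gain
    rest = [a for a in attrs if a != best]
    children = {}
    for v in set(r[best] for r in rows):
        idx = [i for i, r in enumerate(rows) if r[best] == v]
        children[v] = id3([rows[i] for i in idx], [labels[i] for i in idx], rest)
    return (best, children, majority)

def tree_predict(tree, row):
    """Walk the tree; an unseen attribute value falls back to the node majority."""
    while isinstance(tree, tuple):
        attr, children, majority = tree
        tree = children.get(row[attr], majority)
    return tree

def train(points, labels, k, seed=0):
    """Cluster with k-Means, then build one ID3 tree per non-empty cluster."""
    centroids, groups = kmeans(points, k, seed=seed)
    model = []
    for centroid, g in zip(centroids, groups):
        if not g:
            continue
        cluster_labels = [labels[i] for i in g]
        rows = [discretize(points[i]) for i in g]
        model.append({
            "centroid": centroid,
            "kmeans_label": Counter(cluster_labels).most_common(1)[0][0],
            "tree": id3(rows, cluster_labels, list(range(len(centroid)))),
        })
    return model

def classify(model, p, rule="nearest-neighbor", f=2):
    """ASSUMED reading of the two combination rules (see lead-in)."""
    ranked = sorted(model, key=lambda m: dist(p, m["centroid"]))[:f]
    row = discretize(p)
    if rule == "nearest-consensus":
        for m in ranked:
            decision = tree_predict(m["tree"], row)
            if decision == m["kmeans_label"]:
                return decision
    # Nearest-neighbor rule, also the assumed fallback when no consensus exists.
    return tree_predict(ranked[0]["tree"], row)

# Hypothetical toy data: a mostly normal region that hides two anomalies,
# plus a distant, purely anomalous region.
points = [(0.2, 0.3), (0.5, 0.8), (0.9, 0.1), (0.4, 0.6), (1.5, 0.5), (1.8, 0.2),
          (10.1, 10.2), (10.5, 10.9), (10.8, 10.3), (10.3, 10.6)]
labels = ["normal"] * 4 + ["anomaly"] * 6
model = train(points, labels, k=2)
print(classify(model, (0.3, 0.4)), classify(model, (1.6, 0.4), rule="nearest-consensus"))
```

In this toy setup the test point (1.6, 0.4) lands in the cluster whose majority label is normal, yet the tree built on that cluster places it in the bin occupied by the training anomalies. This is the kind of sub-group refinement within a cluster that the abstract attributes to the per-cluster ID3 trees.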