
Mining class association rules on imbalanced class datasets



Abstract

Existing class association rule mining algorithms often struggle to discover good rule sets from class-imbalanced datasets, because they tend to generate rules that belong to the dominant classes. For example, in medical applications some symptoms of illness are uncommon, yet doctors are very interested in the rules associated with them. This paper proposes a novel approach for mining class association rules (CARs) in class-imbalanced datasets. First, assuming there are n given classes, the training dataset is split into n corresponding groups. The data in each group is then clustered by the k-means algorithm into k clusters, where k equals the number of records in the smallest group. Second, all records from the groups are combined after clustering, and the CAR-Miner-Diff algorithm is used to mine all CARs. We also propose an iterative method for obtaining a highly accurate classifier. Experiments show that the proposed approach outperforms existing algorithms while maintaining a large number of useful rules in the classifier.
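The balancing step described in the abstract can be sketched roughly as follows. This is an illustrative sketch under several assumptions, not the authors' implementation: records are assumed to be numeric feature vectors, the function names (`kmeans`, `balance_by_clustering`) are hypothetical, and because the abstract does not say how each cluster is turned back into a record, the sketch keeps the member closest to each centroid. The CAR-Miner-Diff mining step and the iterative classifier refinement are not reproduced here.

```python
import random
from collections import defaultdict


def squared_distance(a, b):
    """Squared Euclidean distance between two numeric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns (cluster index per point, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


def balance_by_clustering(records, labels):
    """Reduce every class to at most k records, k = size of the smallest class."""
    # Split the training data into one group per class.
    groups = defaultdict(list)
    for rec, lab in zip(records, labels):
        groups[lab].append(rec)
    k = min(len(group) for group in groups.values())

    balanced = []
    for lab, group in groups.items():
        if len(group) == k:
            # The smallest class is kept as-is.
            balanced.extend((rec, lab) for rec in group)
            continue
        assignment, centroids = kmeans(group, k)
        for c in range(k):
            members = [p for p, a in zip(group, assignment) if a == c]
            if members:
                # Assumption: represent a cluster by its most central record.
                rep = min(members, key=lambda p: squared_distance(p, centroids[c]))
                balanced.append((rep, lab))
    return balanced
```

The combined list returned by `balance_by_clustering` would then be handed to a CAR miner. A cluster that ends up empty contributes no record, so a class may come out slightly below k records in this sketch.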
