Conference: IEEE International Congress on Big Data

Mining incomplete data with many attribute-concept values and 'do not care' conditions

Abstract

In this paper we present novel experimental results comparing two interpretations of missing attribute values: attribute-concept values and "do not care" conditions. Experiments were conducted on six data sets with many missing attribute values using the MLEM2 rule induction system. Three kinds of probabilistic approximations were used: singleton, subset, and concept. The error rate of the induced rules was evaluated by ten-fold cross-validation. The two interpretations of missing values were compared using the best result among the three probabilistic approximations. In two cases attribute-concept values performed better, and in one case "do not care" conditions performed better. In the remaining three cases the difference in performance was not statistically significant (5% significance level).
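The abstract names the two interpretations and the three probabilistic approximations without defining them. The sketch below illustrates them on a toy table, assuming the definitions commonly used in the rough-set literature on incomplete data: a "do not care" value ("*") matches every specified value of the attribute, an attribute-concept value ("-") matches only the values seen for that attribute among cases with the same decision, and an approximation with threshold alpha is built from characteristic sets K(x) with Pr(X | K(x)) >= alpha. The table, attribute names, and function names are hypothetical, and the code does not reproduce MLEM2 or the paper's experiments.

```python
from fractions import Fraction

# Hypothetical toy table: case id -> (attribute values, decision).
# "*" is a "do not care" condition, "-" is an attribute-concept value.
CASES = {
    1: ({"temp": "high", "cough": "yes"}, "flu"),
    2: ({"temp": "high", "cough": "*"}, "flu"),
    3: ({"temp": "-", "cough": "no"}, "flu"),
    4: ({"temp": "low", "cough": "no"}, "no"),
    5: ({"temp": "*", "cough": "yes"}, "no"),
    6: ({"temp": "high", "cough": "-"}, "no"),
}
ATTRS = ["temp", "cough"]
U = set(CASES)
MISSING = ("*", "-")


def concept_values(x, a):
    """Specified values of attribute a among cases sharing x's decision."""
    d = CASES[x][1]
    return {CASES[y][0][a] for y in U
            if CASES[y][1] == d and CASES[y][0][a] not in MISSING}


def block(a, v):
    """Attribute-value block [(a, v)] under both interpretations."""
    members = set()
    for x in U:
        val = CASES[x][0][a]
        if val == v or val == "*":
            members.add(x)
        elif val == "-" and v in concept_values(x, a):
            members.add(x)
    return members


def characteristic_set(x):
    """Intersect K(x, a) over all attributes a."""
    k = set(U)
    for a in ATTRS:
        val = CASES[x][0][a]
        if val == "*":
            continue  # K(x, a) is the whole universe
        if val == "-":
            vals = concept_values(x, a)
            if vals:  # with no candidate values, K(x, a) is the universe
                k &= set().union(*(block(a, v) for v in vals))
            continue
        k &= block(a, val)
    return k


def approximations(concept, alpha):
    """Return (singleton, subset, concept) probabilistic approximations."""
    K = {x: characteristic_set(x) for x in U}
    ok = {x for x in U
          if Fraction(len(concept & K[x]), len(K[x])) >= alpha}
    singleton = set(ok)
    subset = set().union(*(K[x] for x in ok))
    concept_approx = set().union(*(K[x] for x in ok & concept))
    return singleton, subset, concept_approx


if __name__ == "__main__":
    flu = {x for x in U if CASES[x][1] == "flu"}
    print(approximations(flu, Fraction(3, 5)))
```

On this toy table with alpha = 3/5, the singleton approximation of the "flu" concept is a strict subset of the subset and concept approximations, which is one reason the three variants can induce different rule sets from the same data.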
