Conference: IEEE International Congress on Big Data

Mining incomplete data with many attribute-concept values and 'do not care' conditions


Abstract

In this paper we present novel experimental results comparing two interpretations of missing attribute values: attribute-concept values and "do not care" conditions. Experiments were conducted on 12 data sets with many missing attribute values using the MLEM2 rule induction system. Three kinds of probabilistic approximations were used (singleton, subset and concept), and the error rate of the induced rules was evaluated by ten-fold cross validation. The two interpretations of missing values were compared using the best result among the three probabilistic approximations. In two cases attribute-concept values gave the better performance, and in one case "do not care" conditions gave the better performance. In the remaining three cases the difference in performance was not statistically significant (5% significance level).
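The abstract names the two interpretations and the three probabilistic approximations without defining them. The sketch below illustrates them using the definitions common in the rough-set literature on incomplete data, which is an assumption here because the abstract itself gives no formulas. The toy table, the `*` and `-` markers, and all function names are hypothetical, and the code is not the MLEM2 system.

```python
from typing import Dict, List, Set

# Hypothetical toy decision table, not data from the paper.
# "*" marks a "do not care" condition, "-" an attribute-concept value.
CASES: List[Dict[str, str]] = [
    {"Temperature": "high",   "Headache": "-",   "Flu": "yes"},
    {"Temperature": "*",      "Headache": "yes", "Flu": "yes"},
    {"Temperature": "high",   "Headache": "no",  "Flu": "no"},
    {"Temperature": "normal", "Headache": "*",   "Flu": "no"},
]
ATTRIBUTES = ["Temperature", "Headache"]
DECISION = "Flu"
UNIVERSE: Set[int] = set(range(len(CASES)))


def concept(decision_value: str) -> Set[int]:
    """All cases that share the given decision value."""
    return {i for i in UNIVERSE if CASES[i][DECISION] == decision_value}


def concept_values(i: int, attr: str) -> Set[str]:
    """Specified values of attr among cases in the same concept as case i."""
    same = concept(CASES[i][DECISION])
    return {CASES[j][attr] for j in same if CASES[j][attr] not in ("*", "-")}


def block(attr: str, value: str) -> Set[int]:
    """Cases matching (attr, value) under both interpretations."""
    result = set()
    for i, case in enumerate(CASES):
        v = case[attr]
        if v == value or v == "*":
            result.add(i)  # "do not care": matches every value
        elif v == "-" and value in concept_values(i, attr):
            result.add(i)  # attribute-concept: matches values seen in its concept
    return result


def characteristic_set(i: int) -> Set[int]:
    """Cases indistinguishable from case i, given its missing values."""
    result = set(UNIVERSE)
    for attr in ATTRIBUTES:
        v = CASES[i][attr]
        if v == "*":
            continue  # places no restriction
        if v == "-":
            values = concept_values(i, attr)
            if not values:
                continue  # nothing known in the concept: no restriction
            result &= set().union(*(block(attr, w) for w in values))
        else:
            result &= block(attr, v)
    return result


def cond_prob(x_set: Set[int], i: int) -> float:
    """Pr(X | K(i)) = |X & K(i)| / |K(i)|."""
    k = characteristic_set(i)
    return len(x_set & k) / len(k)


def singleton_approx(x_set: Set[int], alpha: float) -> Set[int]:
    """Cases whose characteristic set supports X with probability >= alpha."""
    return {i for i in UNIVERSE if cond_prob(x_set, i) >= alpha}


def subset_approx(x_set: Set[int], alpha: float) -> Set[int]:
    """Union of qualifying characteristic sets over all cases."""
    chosen = [characteristic_set(i) for i in UNIVERSE if cond_prob(x_set, i) >= alpha]
    return set().union(*chosen)


def concept_approx(x_set: Set[int], alpha: float) -> Set[int]:
    """Union of qualifying characteristic sets over cases in X only."""
    chosen = [characteristic_set(i) for i in x_set if cond_prob(x_set, i) >= alpha]
    return set().union(*chosen)
```

For example, `singleton_approx(concept("yes"), 1.0)` and `subset_approx(concept("yes"), 1.0)` can be compared to see how the approximations differ for the same threshold. The threshold `alpha` plays the role of the probability parameter: 1.0 corresponds to a lower approximation, and values close to 0 approach an upper approximation.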
