Journal: Knowledge-Based Systems

An effective and efficient approach to classification with incomplete data

Abstract

Many real-world datasets suffer from the unavoidable issue of missing values. Classification with incomplete data has to be handled carefully, because inadequate treatment of missing values causes large classification errors. Using imputation to transform incomplete data into complete data is a common approach to classification with incomplete data. However, simple imputation methods are often not accurate, and powerful imputation methods are usually computationally intensive. A recent approach to handling incomplete data constructs an ensemble of classifiers, each tailored to a known pattern of missing data. The main advantage of this approach is that it can classify new incomplete instances without requiring any imputation. This paper proposes an improvement on the ensemble approach by integrating imputation and genetic-based feature selection. The imputation creates higher-quality training data. The feature selection reduces the number of missing patterns, which speeds up classification and greatly increases the fraction of new instances that the ensemble can classify. Experimental results show that the proposed method is more accurate and faster than previous common methods for classification with incomplete data.
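The ensemble idea summarised in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes missing values are encoded as `None`, it uses a toy nearest-centroid base classifier, and it takes a hand-picked `selected` feature subset in place of the genetic search. The imputation of training data is omitted, and all class and function names are hypothetical.

```python
from collections import Counter


def observed_features(row):
    """Indices of the features that are present (not None) in a row."""
    return frozenset(i for i, v in enumerate(row) if v is not None)


class NearestCentroid:
    """Toy base classifier that only looks at a fixed subset of features."""

    def __init__(self, features):
        self.features = frozenset(features)
        self.columns = sorted(self.features)
        self.centroids = {}

    def fit(self, rows, labels):
        grouped = {}
        for row, label in zip(rows, labels):
            grouped.setdefault(label, []).append([row[i] for i in self.columns])
        for label, points in grouped.items():
            self.centroids[label] = [sum(col) / len(col) for col in zip(*points)]
        return self

    def predict(self, row):
        point = [row[i] for i in self.columns]

        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(point, self.centroids[label]))

        return min(self.centroids, key=sq_dist)


class MissingPatternEnsemble:
    """One classifier per pattern of observed features seen in training."""

    def __init__(self, selected=None):
        # `selected` stands in for the output of a feature-selection step.
        self.selected = None if selected is None else frozenset(selected)
        self.members = []

    def _pattern(self, row):
        present = observed_features(row)
        return present if self.selected is None else present & self.selected

    def fit(self, rows, labels):
        self.members = []
        for features in {self._pattern(r) for r in rows}:
            if not features:
                continue
            # Train on every instance in which all of these features are observed.
            usable = [(r, y) for r, y in zip(rows, labels)
                      if features <= observed_features(r)]
            sub_rows, sub_labels = zip(*usable)
            self.members.append(NearestCentroid(features).fit(sub_rows, sub_labels))
        return self

    def predict(self, row):
        # Only members whose features are all observed in `row` may vote,
        # so no imputation is needed at prediction time.
        available = observed_features(row)
        votes = [m.predict(row) for m in self.members if m.features <= available]
        if not votes:
            return None  # no applicable classifier for this missing pattern
        return Counter(votes).most_common(1)[0][0]


train_rows = [
    [1.0, 1.0, None],
    [1.2, 0.8, 1.1],
    [5.0, None, 5.2],
    [4.8, 5.1, 4.9],
]
train_labels = ["a", "a", "b", "b"]

full = MissingPatternEnsemble().fit(train_rows, train_labels)
reduced = MissingPatternEnsemble(selected={1}).fit(train_rows, train_labels)
print(len(full.members), full.predict([None, 5.0, None]))
print(len(reduced.members), reduced.predict([None, 5.0, None]))
```

On this toy data the unrestricted ensemble holds three classifiers and has none that applies to an instance where only feature 1 is observed, so it returns `None`. Restricting the patterns to feature 1 leaves a single classifier that can label that instance. This mirrors, on a very small scale, the abstract's claim that feature selection reduces the number of missing patterns and raises the fraction of classifiable instances.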
