Conference: Software Metrics, 2005. 11th IEEE International Symposium

Ensemble imputation methods for missing software engineering data

Abstract

One primary concern of software engineering is prediction accuracy. For example, datasets are used to build and validate systems that predict software development effort. However, it is not uncommon for such datasets to contain missing values. When machine learning techniques are used to build these prediction systems, the handling of incomplete data is an important issue for classifier learning, since missing values in the training set, the test set, or both can affect prediction accuracy. Many studies in machine learning and statistics have shown that combining individual classifiers into an ensemble is an effective technique for improving classification accuracy. The ensemble strategy is investigated here in the context of incomplete data and software prediction. An ensemble Bayesian multiple imputation and nearest neighbour single imputation method, BAMINNSI, is proposed that constructs ensembles based on two imputation methods. Strong results on two benchmark industrial datasets using decision trees support the method.
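The general idea of building an ensemble from two imputation methods can be illustrated with a minimal Python sketch. It is not the BAMINNSI algorithm itself: the abstract does not say how the two imputers are combined, so everything below is an assumption made for illustration. In particular, the Bayesian multiple imputation step is replaced by a crude stand-in that draws missing values from a normal distribution fitted to each column, the decision tree is reduced to a one-level stump, the members are combined by majority vote, and all function names (`knn_impute`, `random_draw_impute`, `fit_stump`, `fit_ensemble`, `predict_ensemble`) are hypothetical. Missing values are represented as `None`, and rows to be predicted are assumed complete.

```python
import random
from collections import Counter


def knn_impute(rows, k=1):
    """Single imputation: fill None entries from the k nearest complete rows."""
    complete = [r for r in rows if None not in r]
    filled = []
    for r in rows:
        observed = [j for j, v in enumerate(r) if v is not None]
        # Distance is measured on the features this row actually has.
        nearest = sorted(
            complete, key=lambda c: sum((c[j] - r[j]) ** 2 for j in observed)
        )[:k]
        filled.append([
            v if v is not None else sum(n[j] for n in nearest) / len(nearest)
            for j, v in enumerate(r)
        ])
    return filled


def random_draw_impute(rows, m=3, seed=0):
    """Stand-in for multiple imputation (assumption): m datasets, each filling
    None entries with draws from a normal fitted to the observed column."""
    rng = random.Random(seed)
    stats = []
    for j in range(len(rows[0])):
        obs = [r[j] for r in rows if r[j] is not None]
        mean = sum(obs) / len(obs)
        var = sum((v - mean) ** 2 for v in obs) / max(len(obs) - 1, 1)
        stats.append((mean, var ** 0.5))
    return [
        [[v if v is not None else rng.gauss(*stats[j]) for j, v in enumerate(r)]
         for r in rows]
        for _ in range(m)
    ]


def fit_stump(X, y):
    """One-level decision tree: pick the (feature, threshold) with fewest errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            left = [lab for x, lab in zip(X, y) if x[j] <= t]
            right = [lab for x, lab in zip(X, y) if x[j] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(a != l_lab for a in left) + sum(a != r_lab for a in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    if best is None:  # no usable split: always predict the majority label
        majority = Counter(y).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]


def predict_stump(stump, x):
    j, t, l_lab, r_lab = stump
    return l_lab if x[j] <= t else r_lab


def fit_ensemble(rows, y, m=3, seed=0):
    """Train one stump per imputed dataset: 1 from k-NN plus m from random draws."""
    datasets = [knn_impute(rows)] + random_draw_impute(rows, m, seed)
    return [fit_stump(X, y) for X in datasets]


def predict_ensemble(stumps, x):
    """Majority vote over the ensemble members."""
    return Counter(predict_stump(s, x) for s in stumps).most_common(1)[0][0]


# Toy data (invented for illustration): two features, two missing entries.
rows = [[1.0, 10.0], [2.0, None], [1.5, 12.0],
        [8.0, 50.0], [9.0, None], [8.5, 55.0]]
labels = ["low", "low", "low", "high", "high", "high"]
model = fit_ensemble(rows, labels)
print(predict_ensemble(model, [1.2, 11.0]), predict_ensemble(model, [8.8, 52.0]))
```

Training one classifier per imputed dataset lets the vote average over the uncertainty in the filled-in values, which is the usual motivation for pairing multiple imputation with an ensemble. A practical version would use a full decision tree learner and a proper Bayesian imputation model.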
