
Case and Feature Subset Selection in Case-Based Software Project Effort Prediction


Abstract

Prediction systems adopting a case-based reasoning (CBR) approach have been widely advocated. However, as with most machine learning techniques, feature and case subset selection can strongly influence the quality of the predictions generated. Unfortunately, both are NP-hard search problems, which are intractable for non-trivial data sets. Using all features frequently leads to poor prediction accuracy, and pre-processing methods (filters) have not generally been effective. In this paper we consider two different real-world project effort data sets. We describe how simple search techniques, such as hill climbing and sequential selection, can achieve major improvements in accuracy. We conclude that, for our data sets, forward sequential selection for features, followed by backward sequential selection for cases, is the most effective approach when exhaustive searching is not possible.
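
The two-stage search named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the nearest-neighbour predictor, the unweighted Euclidean distance, the leave-one-out mean absolute residual used as the accuracy measure, and the stopping rule (stop at the first step that brings no improvement) are all assumptions, as are the function names and the toy data.

```python
from math import inf
from typing import List, Sequence, Set, Tuple

Row = Sequence[float]

def loo_error(X: Sequence[Row], y: Sequence[float],
              feats: Sequence[int], cases: Set[int], k: int = 1) -> float:
    """Leave-one-out mean absolute residual of a k-nearest-neighbour
    (analogy-based) predictor restricted to `feats` and `cases`.
    Every project is predicted; only projects in `cases` may serve
    as analogies (an assumption of this sketch)."""
    if not feats:
        return inf
    total = 0.0
    for i in range(len(X)):
        base = [j for j in sorted(cases) if j != i]
        if not base:
            return inf
        # Rank candidate analogies by squared Euclidean distance.
        base.sort(key=lambda j: sum((X[i][f] - X[j][f]) ** 2 for f in feats))
        nearest = base[:k]
        prediction = sum(y[j] for j in nearest) / len(nearest)
        total += abs(prediction - y[i])
    return total / len(X)

def forward_feature_selection(X: Sequence[Row], y: Sequence[float],
                              cases: Set[int], k: int = 1
                              ) -> Tuple[List[int], float]:
    """Start with no features; repeatedly add the single feature that
    lowers the error most, stopping when no addition helps."""
    selected: List[int] = []
    remaining = set(range(len(X[0])))
    best = inf
    while remaining:
        err, f = min((loo_error(X, y, selected + [f], cases, k), f)
                     for f in sorted(remaining))
        if err >= best:
            break
        selected.append(f)
        remaining.remove(f)
        best = err
    return selected, best

def backward_case_selection(X: Sequence[Row], y: Sequence[float],
                            feats: Sequence[int], k: int = 1
                            ) -> Tuple[Set[int], float]:
    """Start with all cases; repeatedly drop the single case whose
    removal lowers the error most, stopping when no removal helps."""
    cases = set(range(len(X)))
    best = loo_error(X, y, feats, cases, k)
    while len(cases) > k + 1:
        err, c = min((loo_error(X, y, feats, cases - {c}, k), c)
                     for c in sorted(cases))
        if err >= best:
            break
        cases.remove(c)
        best = err
    return cases, best

# Hypothetical toy data: column 0 tracks effort, column 1 is noise.
X_demo = [[1, 9], [2, 1], [3, 7], [4, 2], [5, 8], [6, 3]]
y_demo = [10, 20, 30, 40, 50, 60]

feats, err_f = forward_feature_selection(X_demo, y_demo, set(range(len(X_demo))))
cases, err_c = backward_case_selection(X_demo, y_demo, feats)
print("features:", feats, "cases kept:", sorted(cases), "error:", err_c)
```

Each greedy step costs one full leave-one-out evaluation per candidate, so the search is polynomial in the number of features and cases, in contrast to the exhaustive search over all subsets that the abstract describes as intractable.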
