Conference: Asia-Pacific Software Engineering Conference

Filter-INC: Handling Effort-Inconsistency in Software Effort Estimation Datasets


Abstract

Effort-inconsistency is a situation in which the historical software project data used for software effort estimation (SEE) are contaminated by many project cases that have similar characteristics but were completed with significantly different amounts of effort. Using such data for SEE generally produces inaccurate results; however, an effective technique for handling the problem has yet to be made available. This study approaches the problem differently from common solutions, which typically attempt to remove every project case they detect as an outlier. Instead, we hypothesize that data inconsistency is caused by only a few deviant project cases, and that any attempt to remove the other cases will reduce accuracy, largely because of the loss of useful information and data diversity. Filter-INC (short for Filtering technique for handling effort-INConsistency in SEE datasets) implements this hypothesis to decide whether a project case detected by any existing technique should be subject to removal. The evaluation compares the performance of 2 filtering techniques before and after Filter-INC is applied. The results, produced from 8 real-world datasets with 3 machine-learning models and evaluated by 4 performance measures, show a significant accuracy improvement at the 95% confidence level. Based on these results, we recommend the proposed hypothesis as an important instrument for designing data preprocessing techniques that handle effort-inconsistency in SEE datasets, an important step toward preprocessing data for more accurate SEE models.
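To make the definition of effort-inconsistency concrete, the following minimal Python sketch flags project cases whose effort differs sharply from that of their most similar neighbours. It is not the Filter-INC algorithm, whose details the abstract does not give; the function name `flag_effort_inconsistent`, the parameters `k` and `ratio`, the Euclidean similarity measure, and the sample data are all assumptions chosen for illustration.

```python
from math import dist
from statistics import median

def flag_effort_inconsistent(projects, k=3, ratio=3.0):
    """Return indices of projects whose effort deviates from similar projects.

    projects: list of (features, effort) pairs with positive effort.
    k and ratio are hypothetical tuning parameters, not values from the paper.
    """
    flagged = []
    for i, (features, effort) in enumerate(projects):
        others = [j for j in range(len(projects)) if j != i]
        # The k most similar projects by Euclidean distance on the features.
        nearest = sorted(others, key=lambda j: dist(features, projects[j][0]))[:k]
        # Typical effort among those similar projects.
        typical = median(projects[j][1] for j in nearest)
        # Flag when the effort is more than `ratio` times larger or smaller.
        if max(effort / typical, typical / effort) > ratio:
            flagged.append(i)
    return flagged

# Made-up sample data: (size, team) features and effort in person-hours.
projects = [
    ((10.0, 3.0), 100.0),
    ((11.0, 3.0), 110.0),
    ((10.5, 3.0), 105.0),
    ((10.2, 3.0), 900.0),   # similar features, very different effort
    ((50.0, 8.0), 2000.0),
    ((52.0, 8.0), 2100.0),
    ((51.0, 8.0), 1900.0),
]
flagged = flag_effort_inconsistent(projects)
print(flagged)
```

In this toy dataset only the fourth project is flagged. Following the hypothesis stated in the abstract, a detector of this kind would only propose candidates; whether a flagged case should actually be removed is the decision Filter-INC is described as making.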
