Filter-INC: Handling Effort-Inconsistency in Software Effort Estimation Datasets

Abstract

Effort-inconsistency is a situation in which the historical software project data used for software effort estimation (SEE) are contaminated by many project cases that have similar characteristics but were completed with significantly different amounts of effort. Using such data for SEE generally produces inaccurate results; however, an effective technique for handling the problem has yet to be made available. This study approaches the problem differently from common solutions, in which available techniques typically attempt to remove every project case they detect as an outlier. Instead, we hypothesize that data inconsistency is caused by only a few deviant project cases, and that any attempt to remove the other cases will reduce accuracy, largely due to the loss of useful information and data diversity. Filter-INC (short for Filtering technique for handling effort-INConsistency in SEE datasets) implements this hypothesis to decide whether a project case detected by an existing technique should be subject to removal. The evaluation compares the performance of 2 filtering techniques before and after Filter-INC is applied. The results, produced from 8 real-world datasets together with 3 machine-learning models and evaluated by 4 performance measures, show a significant accuracy improvement at the 95% confidence level. Based on these results, we recommend our proposed hypothesis as an important instrument for designing data preprocessing techniques that handle effort-inconsistency in SEE datasets, an important step forward in preprocessing data for more accurate SEE models.
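
To make the definition of effort-inconsistency concrete, the following minimal sketch flags pairs of project cases whose features are close but whose efforts differ by a large ratio. It illustrates the definition only and is not the Filter-INC algorithm, whose details the abstract does not give. The function name `find_inconsistent_pairs`, the use of Euclidean distance, both thresholds, and the toy data are assumptions made for this example.

```python
from itertools import combinations
from math import dist


def find_inconsistent_pairs(projects, max_distance=0.1, min_effort_ratio=2.0):
    """Return index pairs of projects with similar features but very different effort.

    projects: list of (features, effort) tuples. Features are assumed to be
    numeric tuples already scaled to comparable ranges; effort is positive.
    Both thresholds are hypothetical values picked for illustration.
    """
    pairs = []
    for (i, (fi, ei)), (j, (fj, ej)) in combinations(enumerate(projects), 2):
        similar = dist(fi, fj) <= max_distance      # close in feature space
        ratio = max(ei, ej) / min(ei, ej)           # how far apart the efforts are
        if similar and ratio >= min_effort_ratio:
            pairs.append((i, j))
    return pairs


# Toy data (made up): normalized (size, team experience) and effort in person-hours.
projects = [
    ((0.20, 0.50), 400.0),
    ((0.22, 0.52), 1900.0),  # close to project 0, but far more effort
    ((0.80, 0.30), 3000.0),
    ((0.81, 0.31), 3200.0),  # close to project 2, with similar effort
]
print(find_inconsistent_pairs(projects))  # → [(0, 1)]
```

A filter in the spirit of the abstract's hypothesis would then remove only a few of the flagged cases rather than all of them; how Filter-INC makes that decision is not described in this record.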

Bibliographic details

Source
Asia-Pacific Software Engineering Conference | 2016 | pp. 185-192 | 8 pages
Authors
Passakorn Phannachitta; Jacky Keung; Kwabena Ebo Bennin; Akito Monden; Kenichi Matsumoto
Original format: PDF
Keywords
Software; Estimation; Software engineering; Information filters; Data models

Similar literature

1. Analyzing the Stationarity Process in Software Effort Estimation Datasets [J]. Michael Franklin Bosu, Stephen G. MacDonell, Peter A. Whigham. International Journal of Software Engineering and Knowledge Engineering, 2020, no. 11a12.
2. Experience: Quality Benchmarking of Datasets Used in Software Effort Estimation [J]. Michael F. Bosu, Stephen G. MacDonell. ACM Journal of Data and Information Quality, 2019, no. 4.
3. Active Learning and Effort Estimation: Finding the Essential Content of Software Effort Estimation Data [J]. Kocaguneli, Ekrem, Menzies. IEEE Transactions on Software Engineering, 2013, no. 8.
4. Filter-INC: Handling Effort-Inconsistency in Software Effort Estimation Datasets [C]. Passakorn Phannachitta, Jacky Keung, Kwabena Ebo Bennin. Asia-Pacific Software Engineering Conference, 2016.
5. Software effort estimation accuracy: A comparative study of estimations based on software sizing and development methods [D]. Lafferty, Mark T. 2010.
6. Software Development Effort Estimation Using Regression Fuzzy Models [O]. Ali Bou Nassif, Mohammad Azzeh, Ali Idri. 2019.
7. KOMPARASI METODE KOMBINASI SELEKSI FITUR DAN MACHINE LEARNING K-NEAREST NEIGHBOR PADA DATASET LABEL HOURS SOFTWARE EFFORT ESTIMATION [O]. Indra Kurniawan, Ahmad Faiq Abror. 2019.
