Journal: ACM Journal of Data and Information Quality

Experience: Quality Benchmarking of Datasets Used in Software Effort Estimation


Abstract

Data are a cornerstone of empirical software engineering (ESE) research and practice. Data underpin numerous process and project management activities, including the estimation of development effort and the prediction of the likely location and severity of defects in code. Serious questions have been raised, however, over the quality of the data used in ESE. Data quality problems caused by noise, outliers, and incompleteness have been noted as being especially prevalent. Other quality issues, although also potentially important, have received less attention. In this study, we assess the quality of 13 datasets that have been used extensively in research on software effort estimation. The quality issues considered in this article draw on a taxonomy that we published previously based on a systematic mapping of data quality issues in ESE. Our contributions are as follows: (1) an evaluation of the "fitness for purpose" of these commonly used datasets and (2) an assessment of the utility of the taxonomy in terms of dataset benchmarking. We also propose a template that could be used both to improve the ESE data collection/submission process and to evaluate other such datasets, contributing to enhanced awareness of data quality issues in the ESE community and, in time, the availability and use of higher-quality datasets.