An empirical comparison of validation methods for software prediction models

Asad Ali; Carmine Gravino

Abstract

Model validation methods (e.g., k-fold cross-validation) use historical data to predict how well an estimation technique (e.g., random forest) performs on current (or future) data. Studies in the contexts of software development effort estimation (SDEE) and software fault prediction (SFP) have used and investigated different model validation methods. However, there are no conclusive indications of which model validation method has a major impact on the prediction accuracy and stability of estimation techniques. Some studies have investigated model validation methods specific to data about either SDEE or SFP. To the best of our knowledge, no study in the literature has employed different validation methods with both SDEE and SFP data. The aim of this paper is to consider 10 different methods from the family of cross-validation (CV) and bootstrap validation methods to identify which one contributes to obtaining better prediction accuracy for both types of data. We also evaluate which model validation methods allow the estimation techniques to provide stable performance (i.e., with lower variance). To this aim, we present an empirical study involving six datasets from the SDEE domain and six datasets from the SFP domain. The results reveal that repeated 10-fold CV with SDEE data and optimistic boot with SFP data are the model validation methods that provide better prediction accuracy in a greater number of experiments than the other model validation methods. Furthermore, a model validation method can improve the prediction accuracy by up to 60% with SDEE data and by up to 36% with SFP data. The analysis also reveals that repeated fivefold CV produces more stable performance when the experiments are repeated on the same data.
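
To make the two families of validation methods named in the abstract concrete, the following is a minimal sketch, not the authors' experimental setup. It assumes a toy estimator that always predicts the training mean, mean absolute error as the accuracy measure, and made-up effort values; none of these come from the paper. The bootstrap variant shown is the plain out-of-sample bootstrap, which is simpler than the "optimistic boot" method the abstract reports on. All function names are hypothetical.

```python
import random
import statistics


def repeated_kfold_indices(n, k, repeats, seed=0):
    """Yield (train, test) index lists for `repeats` rounds of k-fold CV."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)  # a fresh random partition for every repeat
        folds = [order[i::k] for i in range(k)]  # k near-equal folds
        for i, test in enumerate(folds):
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train, test


def bootstrap_indices(n, rounds, seed=0):
    """Yield (train, test): train is drawn with replacement, test is the unsampled rows."""
    rng = random.Random(seed)
    for _ in range(rounds):
        train = [rng.randrange(n) for _ in range(n)]
        chosen = set(train)
        test = [i for i in range(n) if i not in chosen]
        if test:  # skip the rare round in which every row was sampled
            yield train, test


def mean_absolute_error(y, train, test):
    """Score a toy estimator that always predicts the training mean (assumption)."""
    prediction = statistics.fmean(y[i] for i in train)
    return statistics.fmean(abs(y[i] - prediction) for i in test)


# Made-up effort values, for illustration only.
effort = [12.0, 30.5, 7.2, 45.0, 19.8, 22.1, 9.4, 60.3, 15.7, 27.9,
          33.3, 11.1, 41.6, 18.2, 25.0, 8.8, 52.4, 14.3, 36.7, 21.5]

cv_scores = [mean_absolute_error(effort, tr, te)
             for tr, te in repeated_kfold_indices(len(effort), k=10, repeats=5)]
boot_scores = [mean_absolute_error(effort, tr, te)
               for tr, te in bootstrap_indices(len(effort), rounds=50)]

# The mean score reflects accuracy; the variance reflects stability.
for name, scores in (("repeated 10-fold CV", cv_scores),
                     ("out-of-sample bootstrap", boot_scores)):
    print(f"{name}: mean MAE={statistics.fmean(scores):.2f}, "
          f"variance={statistics.variance(scores):.2f}")
```

Comparing the mean and the variance of the per-split scores mirrors the two questions the abstract raises: which validation method yields better prediction accuracy, and which yields more stable performance.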

Bibliographic details

Source
Journal of software: evolution and process | 2021, Issue 8 | e2367.1-e2367.38 | 38 pages
Authors
Asad Ali; Carmine Gravino
Affiliations
Department of Computer Science, University of Salerno, Fisciano, Italy (listed for both authors)
Original format
PDF
Language
English
Keywords
model validation methods; software development efforts estimation; software faults prediction
