Understanding the Automated Parameter Optimization on Transfer Learning for Cross-Project Defect Prediction: An Empirical Study

Abstract

Data-driven defect prediction has become increasingly important in the software engineering process. Since it is not uncommon that data from a software project is insufficient for training a reliable defect prediction model, transfer learning that borrows data/knowledge from other projects to facilitate model building for the current project, namely cross-project defect prediction (CPDP), is naturally plausible. Most CPDP techniques involve two major steps, i.e., transfer learning and classification, each of which has at least one parameter to be tuned to achieve its optimal performance. This practice fits well with the purpose of automated parameter optimization. However, there is a lack of thorough understanding of the impacts of automated parameter optimization on various CPDP techniques. In this paper, we present the first empirical study that looks into such impacts on 62 CPDP techniques, 13 of which are chosen from the existing CPDP literature while the other 49 have not been explored before. We build defect prediction models over 20 real-world software projects of different scales and characteristics. Our findings demonstrate that: (1) Automated parameter optimization substantially improves the defect prediction performance of 77% of the CPDP techniques with a manageable computational cost. Thus more effort on this aspect is required in future CPDP studies. (2) Transfer learning is of ultimate importance in CPDP. Given a tight computational budget, it is more cost-effective to focus on optimizing the parameter configuration of transfer learning algorithms. (3) Research on CPDP is far from mature: it is 'not difficult' to find a better alternative by combining existing transfer learning and classification techniques. This finding provides important insights for the future design of CPDP techniques.

Bibliographic details

Source
IEEE/ACM International Conference on Software Engineering | 2020 | pp. 566-577 | 12 pages
Authors
Ke Li; Zilin Xiang; Tao Chen; Shuo Wang; Kay Chen Tan
Original format: PDF
Keywords
Training; Predictive models; Software; Data models; Optimization; Tuning; Software engineering

Similar literature

1. Cross-project defect prediction using data sampling for class imbalance learning: an empirical study [J]. Goel Lipika, Sharma Mayank, Khatri Sunil Kumar. International Journal of Parallel, Emergent and Distributed Systems. 2021, No. 1a2
2. An empirical analysis of the statistical learning models for different categories of cross-project defect prediction [J]. Lipika Goel, Mayank Sharma, Sunil Kumar Khatri. International Journal of Computer Aided Engineering and Technology. 2021, No. 2
3. Applicability of an Automated Model and Parameter Selection in the Prediction of Screening-Level PTSD in Danish Soldiers Following Deployment: Development Study of Transferable Predictive Models Using Automated Machine Learning [J]. Karen-Inge Karstoft, Ioannis Tsamardinos, Kasper Eskelund. JMIR Medical Informatics. 2020, No. 7
4. An Empirical Study on Combining Source Selection and Transfer Learning for Cross-Project Defect Prediction [C]. Wanzhi Wen, Bin Zhang, Xiang Gu. 2019 IEEE 1st International Workshop on Intelligent Bug Fixing. 2019
5. A Software Metrics Clustering Approach to Cross-Project Defect Prediction [D]. Sezer, Anil. 2019
6. Software Defect Prediction for Healthcare Big Data: An Empirical Evaluation of Machine Learning Techniques [O]. Bilal Khan, Rashid Naseem, Muhammad Arif Shah. 2021
7. An Empirical Study on the Effectiveness of Feature Selection for Cross-Project Defect Prediction [O]. Qiao Yu, Junyan Qian, Shujuan Jiang. 2019
