Conference: IEEE/ACM International Conference on Software Engineering

Understanding the Automated Parameter Optimization on Transfer Learning for Cross-Project Defect Prediction: An Empirical Study

Abstract

Data-driven defect prediction has become increasingly important in the software engineering process. Since it is not uncommon that data from a software project is insufficient for training a reliable defect prediction model, transfer learning that borrows data/knowledge from other projects to facilitate model building for the current project, namely cross-project defect prediction (CPDP), is naturally plausible. Most CPDP techniques involve two major steps, i.e., transfer learning and classification, each of which has at least one parameter to be tuned to achieve its optimal performance. This practice fits well with the purpose of automated parameter optimization. However, there is a lack of thorough understanding of the impacts of automated parameter optimization on various CPDP techniques. In this paper, we present the first empirical study that looks into such impacts on 62 CPDP techniques, 13 of which are chosen from the existing CPDP literature while the other 49 have not been explored before. We build defect prediction models over 20 real-world software projects of different scales and characteristics. Our findings demonstrate that: (1) Automated parameter optimization substantially improves the defect prediction performance of 77% of the CPDP techniques at a manageable computational cost. Thus, more effort on this aspect is required in future CPDP studies. (2) Transfer learning is of ultimate importance in CPDP. Given a tight computational budget, it is more cost-effective to focus on optimizing the parameter configuration of transfer learning algorithms. (3) The research on CPDP is far from mature, in that it is ‘not difficult’ to find a better alternative by combining existing transfer learning and classification techniques. This finding provides important insights into the future design of CPDP techniques.
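The two-step structure described in the abstract (a transfer learning step followed by a classifier, each with its own parameters) can be illustrated with a small sketch. Nothing below is taken from the paper: the nearest-neighbour instance filter, the k-NN classifier, the exhaustive grid search, the synthetic data, and every function and parameter name are hypothetical stand-ins chosen only to show how both steps might be tuned jointly. It also assumes that a small labelled sample of the target project is available for scoring configurations.

```python
import random
from itertools import product

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nn_filter(source, target_X, k):
    """Transfer step (assumed): keep each target row's k nearest source rows."""
    keep = set()
    for t in target_X:
        ranked = sorted(range(len(source)), key=lambda i: dist2(source[i][0], t))
        keep.update(ranked[:k])
    return [source[i] for i in sorted(keep)]

def knn_predict(train, x, n_neighbors):
    """Classification step (assumed): strict majority vote of nearest rows."""
    nearest = sorted(train, key=lambda row: dist2(row[0], x))[:n_neighbors]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train, valid, n_neighbors):
    """Fraction of validation rows whose label is predicted correctly."""
    hits = sum(knn_predict(train, x, n_neighbors) == y for x, y in valid)
    return hits / len(valid)

def grid_search(source, valid, ks, ns):
    """Score every (k, n_neighbors) pair; return the best (score, k, n)."""
    target_X = [x for x, _ in valid]
    best = None
    for k, n in product(ks, ns):
        train = nn_filter(source, target_X, k)   # tune the transfer step
        score = accuracy(train, valid, n)        # tune the classifier
        if best is None or score > best[0]:
            best = (score, k, n)
    return best

def make_project(rng, n, shift):
    """Synthetic 'project': two metrics per module, 1 = defective (made up)."""
    rows = []
    for _ in range(n):
        x = (rng.gauss(shift, 1.0), rng.gauss(shift, 1.0))
        rows.append((x, 1 if x[0] + x[1] > 2 * shift else 0))
    return rows

if __name__ == "__main__":
    rng = random.Random(0)
    source = make_project(rng, 60, 0.0)   # data borrowed from another project
    valid = make_project(rng, 20, 0.5)    # small labelled target sample
    print(grid_search(source, valid, ks=[1, 3, 5], ns=[1, 3]))
```

Restricting the search to `ks` alone while fixing `n_neighbors` would correspond to spending the whole budget on the transfer step, which is the kind of trade-off finding (2) is about; this toy setup is not evidence for or against that finding.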
