Conference: International Conference on Intelligent Science and Big Data Engineering

Deep Metric Learning for Software Change-Proneness Prediction

Abstract

Software change-proneness prediction, which predicts whether class files in a project will be changed in the next release, can help software developers allocate resources more effectively and reduce software maintenance costs. Previous studies found that change-proneness prediction does not work well with limited training data, especially for new projects. To address this issue, cross-project change-proneness prediction has been proposed: it builds a prediction model from sufficient data from other projects, i.e. the source projects, and predicts the change-prone files in a target project. However, cross-project prediction is unstable because metric values differ widely between source projects, which makes change-prone files hard to classify. To improve cross-project prediction, we propose a Deep Metric Learning (DML) model that minimizes this feature distinction before file classification. Specifically, DML maps files in the source projects into a particular space in which files from the same category, e.g. change-prone files, are pulled closer together while files from different categories are pushed farther apart. In addition, we apply an over-sampling approach to handle the highly imbalanced dataset during model training. We verify our model on 20 change-proneness datasets and compare it with 5 cross-project change-proneness models. Results indicate that the proposed model can substantially improve the performance of change-proneness prediction.
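
The two ingredients named in the abstract, over-sampling and a distance-based embedding objective, can be illustrated with a minimal sketch. The abstract does not specify which over-sampling method or which loss the authors use, so this sketch assumes plain random over-sampling and a pairwise contrastive loss as stand-ins; the function names `random_oversample` and `contrastive_loss` and the `margin` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest.

    Assumption: plain random over-sampling; the paper's actual method is not given.
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts = []
    for c, n in zip(classes, counts):
        idx = np.flatnonzero(y == c)
        # Draw the missing rows (with replacement) from this class only.
        extra = rng.choice(idx, size=target - n, replace=True)
        parts.append(np.concatenate([idx, extra]))
    keep = np.concatenate(parts)
    return X[keep], y[keep]


def contrastive_loss(Z, y, margin=1.0):
    """Mean contrastive loss over all unordered pairs of embeddings Z.

    Same-label pairs are penalised by their squared distance; different-label
    pairs are penalised only when they are closer than `margin`.
    Assumption: a stand-in objective, not necessarily the paper's loss.
    """
    diff = Z[:, None, :] - Z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    same = y[:, None] == y[None, :]
    upper = np.triu_indices(len(y), k=1)  # each pair once, no self-pairs
    d, s = dist[upper], same[upper]
    per_pair = np.where(s, d ** 2, np.maximum(0.0, margin - d) ** 2)
    return per_pair.mean()


if __name__ == "__main__":
    # Toy data: 4 stable files (label 0) and 1 change-prone file (label 1).
    X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.0, 0.2], [0.05, 0.1]])
    y = np.array([0, 0, 1, 0, 0])
    Xb, yb = random_oversample(X, y)
    print(np.bincount(yb))  # class counts after balancing
    print(contrastive_loss(Xb, yb))  # loss of the raw features used as embeddings
```

In a full pipeline, a neural network would produce `Z` from the file metrics and be trained to reduce this loss on the balanced source-project data before a classifier is fitted on the embeddings; that training loop is omitted here.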
