The Relationship between Commit Message Detail and Defect Proneness in Java Projects on GitHub


Abstract

Just-In-Time (JIT) defect prediction models aim to predict the commits that will introduce defects in the future. Traditionally, JIT defect prediction models are trained using metrics that are primarily derived from aspects of the code change itself (e.g., the size of the change, the author's prior experience). In addition to the code that is submitted during a commit, authors write commit messages, which describe the commit for archival purposes. It is our position that the level of detail in these commit messages can provide additional explanatory power to JIT defect prediction models. Hence, in this paper, we analyze the relationship between the defect proneness of commits and commit message volume (i.e., the length of the commit message) and commit message content (approximated using spam filtering technology). Through analysis of JIT models that were trained using 342 GitHub repositories, we find that our JIT models outperform random guessing models, achieving AUC values that range from 0.63 to 0.96 and Brier scores that range from 0.01 to 0.21. Furthermore, our metrics that are derived from commit message detail provide a statistically significant boost to the explanatory power of the JIT models in 43%-80% of the studied systems, accounting for up to 72% of the explanatory power. Future JIT studies should consider adding commit message detail metrics.
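The two evaluation measures named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy labels and probabilities are invented, the function names are hypothetical, and commit message volume is approximated as a word count, which may differ from how the study measures length. It is not the authors' pipeline.

```python
# Illustrative sketch only: toy data and hypothetical helper names,
# not the models or metrics code used in the study.

def message_volume(message: str) -> int:
    """Commit message volume, approximated here (an assumption) as word count."""
    return len(message.split())

def brier_score(labels, probs):
    """Mean squared gap between predicted probability and the 0/1 outcome.
    Lower is better; always predicting 0.5 yields 0.25."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)

def auc(labels, probs):
    """Chance that a random defect-inducing commit is ranked above a random
    clean commit (ties count half). Random guessing yields about 0.5."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 = commit later found to introduce a defect, 0 = clean commit.
labels = [1, 0, 1, 0, 0]
probs = [0.9, 0.2, 0.6, 0.7, 0.1]  # hypothetical model outputs

print(message_volume("Fix null check in parser"))
print(round(brier_score(labels, probs), 3))
print(round(auc(labels, probs), 3))
```

Read against these definitions, the reported AUC range of 0.63 to 0.96 sits above the 0.5 expected from random ranking, and the reported Brier scores of 0.01 to 0.21 sit below the 0.25 obtained by always predicting 0.5.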


Bibliographic record

Source: Working Conference on Mining Software Repositories, 2016, pp. 496-499 (4 pages)
Authors: Jacob G. Barnett; Charles K. Gathuru; Luke S. Soldano; Shane McIntosh
Original format: PDF
Keywords: Measurement; Solid modeling; Analytical models; Computational modeling; Predictive models; Java; Bayes methods


