Data stream mining for predicting software build outcomes using source code metrics

Jacqui Finlay; Russel Pears; Andy M. Connor

Abstract

Context: Software development projects involve the use of a wide range of tools to produce a software artifact. Software repositories such as source control systems have become a focus for emergent research because they are a rich source of information about software development projects. The mining of such repositories is becoming increasingly common with a view to gaining a deeper understanding of the development process.

Objective: This paper explores the concept of representing a software development project as a process that results in the creation of a data stream. It also describes the extraction of metrics from the Jazz repository and the application of data stream mining techniques to identify useful metrics for predicting build success or failure.

Method: This research is a systematic study that applies the Hoeffding Tree classification method in conjunction with the Adaptive Sliding Window (ADWIN) method for detecting concept drift, using the Massive Online Analysis (MOA) tool.

Results: The results indicate that only a relatively small number of the available measures considered have any significance for predicting the outcome of a build over time. These significant measures are identified and the implications of the results are discussed, particularly the relative difficulty of predicting failed builds. The Hoeffding Tree approach is shown to produce a more stable and robust model than traditional data mining approaches.

Conclusion: Overall prediction accuracies of 75% have been achieved through the use of the Hoeffding Tree classification method. Despite this high overall accuracy, there is greater difficulty in predicting failure than success. The emergence of a stable classification tree is limited by the lack of data, but overall the approach shows promise in terms of informing software development activities in order to minimize the chance of failure.
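The abstract names the Hoeffding Tree method without describing its split rule. As a rough illustration only, the sketch below shows the standard Hoeffding bound that such trees use to decide when enough stream examples have been seen to split a leaf. It is not the authors' code and does not use MOA; the function names, the gain values, and the `delta` and `tie_threshold` settings are assumptions chosen for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: with probability 1 - delta, the true mean of a variable
    with the given range lies within this distance of the mean observed over
    n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, n_classes: int,
                 delta: float, n: int, tie_threshold: float = 0.05) -> bool:
    """Decide whether a leaf that has seen n examples should split.

    best_gain and second_gain are the information gains of the two best
    candidate attributes (hypothetical inputs here). Information gain over
    n_classes classes ranges over [0, log2(n_classes)].
    """
    value_range = math.log2(n_classes)
    eps = hoeffding_bound(value_range, delta, n)
    # Split when the best attribute is confidently better than the runner-up,
    # or when the bound is so tight that the two are effectively tied.
    return (best_gain - second_gain) > eps or eps < tie_threshold


# Two-class problem (build success vs. failure), illustrative gains only.
early = should_split(0.30, 0.05, n_classes=2, delta=1e-7, n=50)
later = should_split(0.30, 0.05, n_classes=2, delta=1e-7, n=200)
```

With these illustrative numbers the bound is roughly 0.40 after 50 examples and roughly 0.20 after 200, so the same gain difference of 0.25 justifies a split only at the later point. This is one way to see why a stable tree is slow to emerge when data is scarce.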

Bibliographic details

Source: Information and Software Technology, 2014, Issue 2, pp. 183-198 (16 pages)
Authors: Jacqui Finlay; Russel Pears; Andy M. Connor
Affiliation (all three authors): Auckland University of Technology, Auckland, New Zealand
Original format: PDF
Language: English
Keywords: Data stream mining; Concept drift detection; Hoeffding tree; Jazz; Software metrics; Software repositories

Similar literature

1. Predicting Software Build Failure Using Source Code Metrics [J]. Andy M. Connor, Jacqui Finlay. International Journal of Information and Communication Technology Research, 2011, No. 5.
2. Mining software change data stream to predict changeability of classes of object-oriented software system [J]. Anshu Parashar, Jitender Kumar Chhabra. Evolving Systems, 2016, No. 2.
3. Predicting different levels of the unit testing effort of classes using source code metrics: a multiple case study on open-source software [J]. Fadel Toure, Mourad Badri, Luc Lamontagne. Innovations in Systems and Software Engineering, 2018, No. 1.
4. Software maintainability prediction by data mining of software code metrics [C]. Kaur Arvinder, Kaur Kamaldeep, Pathak Kaushal. 2014 International Conference on Data Mining and Intelligent Computing, 2014.
5. Engaging developers in open source software projects: Harnessing social and technical data mining to improve software development [D]. Carlson, Patrick Eric. 2015.
6. Web service QoS prediction using improved software source code metrics [O]. Sarathkumar Rangarajan, Huai Liu, Hua Wang. 2020.
7. Data stream mining for predicting software build outcomes using source code metrics [O]. Finlay, Jacqui; Pears, Russel; Connor, Andy M. 2016.
