Conference: International Conference on Software Quality, Reliability and Security

Which Metrics Should Researchers Use to Collect Repositories: An Empirical Study

Abstract

GitHub is a large, publicly available development platform that hosts Git-based version control, and software developers host a wide variety of projects on it. Researchers who mine software repositories therefore frequently use GitHub to collect software projects as datasets. GitHub provides repository metrics that reflect popularity, contribution, and interest. We believe that such metrics are related to software quality, and we use them to choose the repositories to study according to our research purpose. However, to the best of our knowledge, there is no evidence to support this assumption. Our main purpose is to help researchers who study software quality, especially issue management, choose repository metrics that select appropriate repositories for their studies. In this paper, we study how the characteristics of repositories' issue pages vary with the repository metric used to select them, in order to identify the metric best suited to selecting appropriate repositories. The highlights of our findings are as follows: (1) The number of contributors, that is, the number of developers who contribute to a GitHub repository, can be used to select repositories whose issue pages are well maintained. More specifically, these repositories have more issues, and their developers use labels more frequently, than repositories selected by other metrics. (2) The number of dependencies selects repositories that have fewer issues and in which developers use labels less often than in repositories selected by other metrics.
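
The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the Repo record, the field names, the helpers top_by and summarize, and every number in the toy dataset are hypothetical and were chosen only to show the mechanics of selecting the top repositories by one metric and then summarizing their issue pages (issue count and share of labeled issues).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Repo:
    """Hypothetical record; field names are assumptions, not from the paper."""
    name: str
    contributors: int
    dependencies: int
    issues: int
    labeled_issues: int


def top_by(repos, metric, k):
    """Return the k repositories with the highest value of the given metric."""
    return sorted(repos, key=lambda r: getattr(r, metric), reverse=True)[:k]


def summarize(selected):
    """Summarize issue-page characteristics of a selected set of repositories."""
    total_issues = sum(r.issues for r in selected)
    total_labeled = sum(r.labeled_issues for r in selected)
    return {
        "mean_issues": mean(r.issues for r in selected),
        "labeled_ratio": total_labeled / total_issues if total_issues else 0.0,
    }


# Made-up toy data for illustration only; these are not results from the study.
repos = [
    Repo("a", contributors=50, dependencies=3, issues=200, labeled_issues=150),
    Repo("b", contributors=40, dependencies=10, issues=100, labeled_issues=50),
    Repo("c", contributors=5, dependencies=80, issues=10, labeled_issues=1),
    Repo("d", contributors=2, dependencies=60, issues=20, labeled_issues=3),
]

for metric in ("contributors", "dependencies"):
    selected = top_by(repos, metric, k=2)
    print(metric, [r.name for r in selected], summarize(selected))
```

In a real study the records would presumably come from GitHub itself, and the choice of k and of the summary statistics would follow the research question; the sketch fixes them arbitrarily.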