Journal: ACM Transactions on Software Engineering and Methodology

Boa: Ultra-Large-Scale Software Repository and Source-Code Mining

Abstract

In today's software-centric world, ultra-large-scale software repositories, such as SourceForge, GitHub, and Google Code, are the new Library of Alexandria. They contain an enormous corpus of software and related information. Scientists and engineers alike are interested in analyzing this wealth of information. However, systematic extraction and analysis of relevant data from these repositories for testing hypotheses is hard, and best left to mining software repositories (MSR) experts! Specifically, mining source code yields significant insights into software development artifacts and processes. Unfortunately, mining source code at a large scale remains a difficult task. Previous approaches had to either limit the scope of the projects studied, limit the scope of the mining task to be more coarse-grained, or sacrifice studying the history of the code. In this article, we address mining source code: (a) at a very large scale; (b) at a fine-grained level of detail; and (c) with full history information. To address these challenges, we present domain-specific language features for source-code mining in our language and infrastructure called Boa. The goal of Boa is to ease testing MSR-related hypotheses. Our evaluation demonstrates that Boa substantially reduces programming effort, thus lowering the barrier to entry. We also show drastic improvements in scalability.
