Journal: Data in Brief

Software reusability dataset based on static analysis metrics and reuse rate information

Abstract

The widely adopted component-based development paradigm considers the reuse of proper software components a primary criterion for successful software development. As a result, various research efforts are directed towards evaluating the extent to which a software component is reusable. Prior efforts follow expert-based approaches; however, the continuously growing open-source software initiative allows the introduction of data-driven alternatives. In this context, we have generated a dataset that harnesses information residing in online code hosting facilities and introduces the actual reuse rate of software components as a measure of their reusability. To do so, we have analyzed the most popular projects included in the Maven registry and have computed a large number of static analysis metrics at both the class and package levels using the SourceMeter tool [2]. These metrics quantify six major source code properties: complexity, cohesion, coupling, inheritance, documentation, and size. For these projects, we additionally computed their reuse rate using our self-developed code search engine, AGORA [5]. The generated dataset contains analysis information on more than 24,000 classes and 2,000 packages and can thus serve as the information basis for the design and development of data-driven reusability evaluation methodologies. The dataset is related to the research article entitled “Measuring the Reusability of Software Components using Static Analysis Metrics and Reuse Rate Information” [1].
