Journal: Applied Network Science
Evolution of communities of software: using tensor decompositions to compare software ecosystems

Abstract

Modern software development is often a collaborative effort in which many authors re-use and share code through software libraries. Modern software "ecosystems" are complex socio-technical systems that can be represented as multilayer dynamic networks. Many of these libraries and software packages are open-source and developed in the open on sites such as GitHub, so a large amount of data is available about these networks. Studying these networks could be of interest to anyone choosing or designing a programming language. In this work, we use tensor factorisation to explore the dynamics of communities of software, and then compare these dynamics between languages on a dataset of approximately 1 million software projects. We hope to inform the debate on software dependencies, recently re-ignited by the malicious takeover of the npm package event-stream and other incidents, in two ways: by giving a clearer picture of the structure of software dependency networks, and by exploring how the choices of language designers (for example, the size of standard libraries, or the standards to which packages are held before admission to a language ecosystem is granted) may have shaped their language ecosystems. We establish that adjusted mutual information is a valid metric by which to assess the number of communities in a tensor decomposition. We find that there are striking differences between the communities found across different software ecosystems, and that communities do experience large and interpretable changes in activity over time. The differences between the Elm and R software ecosystems, which see some communities decline over time, and the more conventional software ecosystems of Python, Java and JavaScript, which do not see many declining communities, are particularly marked.
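The abstract names adjusted mutual information (AMI) as the metric for assessing the number of communities, but does not say how it is applied. The sketch below is one plausible reading, not the authors' procedure: it assumes each package is given a hard community label (for instance, the component in which it has the largest weight) and that labelings from two decomposition runs are compared. The function name `adjusted_mutual_info` and the toy label lists are hypothetical; the formula is the standard AMI with the arithmetic mean of the two entropies as the normaliser.

```python
from collections import Counter
from math import exp, lgamma, log


def _entropy(counts, n):
    """Shannon entropy (in nats) of a partition, given its cluster sizes."""
    return -sum((c / n) * log(c / n) for c in counts)


def _log_factorial(k):
    """log(k!) computed via the log-gamma function."""
    return lgamma(k + 1)


def adjusted_mutual_info(labels_u, labels_v):
    """AMI between two hard community assignments of the same items."""
    if len(labels_u) != len(labels_v):
        raise ValueError("label lists must have the same length")
    n = len(labels_u)
    a = Counter(labels_u)                      # community sizes in labeling U
    b = Counter(labels_v)                      # community sizes in labeling V
    joint = Counter(zip(labels_u, labels_v))   # contingency table

    # Observed mutual information between the two labelings.
    mi = sum((nij / n) * log(n * nij / (a[u] * b[v]))
             for (u, v), nij in joint.items())

    # Expected mutual information under random permutation of the labels
    # (hypergeometric model of the contingency table).
    emi = 0.0
    for ai in a.values():
        for bj in b.values():
            for nij in range(max(1, ai + bj - n), min(ai, bj) + 1):
                log_p = (_log_factorial(ai) + _log_factorial(bj)
                         + _log_factorial(n - ai) + _log_factorial(n - bj)
                         - _log_factorial(n) - _log_factorial(nij)
                         - _log_factorial(ai - nij) - _log_factorial(bj - nij)
                         - _log_factorial(n - ai - bj + nij))
                emi += (nij / n) * log(n * nij / (ai * bj)) * exp(log_p)

    mean_h = 0.5 * (_entropy(a.values(), n) + _entropy(b.values(), n))
    denom = mean_h - emi
    if abs(denom) < 1e-12:
        # Degenerate case (e.g. everything in one community): treat as agreement.
        return 1.0
    return (mi - emi) / denom


# Hypothetical labels for six packages from two decomposition runs:
# the grouping is the same, only the community names differ.
run_a = [0, 0, 1, 1, 2, 2]
run_b = [1, 1, 0, 0, 2, 2]
print(adjusted_mutual_info(run_a, run_b))
```

Because AMI subtracts the agreement expected by chance, it does not automatically reward labelings with more communities, which is what makes it a candidate for comparing decompositions with different numbers of components.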
