
People, Processes, and Products: Case Studies in Open-Source Software Using Complex Networks


Abstract

Open-source software has become increasingly popular. Many startups and small business owners adopt open-source software packages to meet their daily office computing needs or to build their IT infrastructure. Unlike proprietary software systems, open-source software systems usually have a loosely organized developer collaboration structure. Developers work on their "assignments" on a voluntary basis, and many never physically meet their "co-workers." This distinctive collaboration pattern leads to a distinctive software development process, and hence to a distinctive structure in the resulting software products. These characteristics of open-source software motivate this dissertation. Our research follows the framework of the four key elements of software engineering: Project, People, Process, and Product (Jacobson, Booch et al. 1999). This dissertation studies three of the four P's: People, Process, and Product. Because many open-source software packages are large and highly complex, the traditional analysis methods and measures of software engineering cannot be readily applied to them. In this dissertation, we adopt complex network theory to analyze open-source software packages, the software development process, and the collaboration among software developers. We aim to discover common characteristics shared by different open-source software packages and to provide a possible explanation of the development process of those software products. Specifically, we represent real-world entities, such as open-source software source code or developer collaborations, as networks of interconnected vertices. We then apply the topological metrics established in complex network theory to analyze those networks. We also propose our own random network growth model to illustrate open-source software development processes.
Our research results could be used by software practitioners who want to develop high-quality software products and reduce risk in the development process. Chapter 1 introduces the dissertation's structure and research scope, namely the study of open-source software with complex networks, and presents the details of the 4-P framework. Chapter 2 analyzes five C-language open-source software packages using function dependency networks, calculating the topological measures of the dependency networks extracted from the software source code. Chapter 3 analyzes the collaborative relationships among open-source software developers. We extract developers' co-working data from two software bug-fixing data sets. Again using complex network theory, we identify a number of topological characteristics of the software developer networks, such as the scale-free property. We also observe topological differences between the bug side and the developer side of the extracted bipartite networks. Chapter 4 compares two widely adopted clustering coefficient definitions, one proposed by Watts and Strogatz and the other by Newman. The analytical similarities and differences between the two definitions provide useful guidance for the random network growth model presented in the next chapter. Chapter 5 aims to characterize the open-source software development process. We propose a two-phase network growth model that describes how different source code units interconnect as the software grows. A case study uses the same five open-source software packages analyzed in Chapter 2. The empirical results indicate that our model provides a possible explanation of how open-source software products are developed.
Chapter 6 concludes the dissertation and highlights possible future research directions.
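
To make the Chapter 4 comparison concrete, the following minimal Python sketch computes both clustering coefficients on a toy undirected graph: the Watts and Strogatz coefficient as the average of per-vertex local clustering values, and the Newman coefficient (transitivity) as the fraction of connected triples that are closed. The edge list and function names are hypothetical illustrations, not the dissertation's data or code, and the sketch assumes the common convention that a vertex with fewer than two neighbors contributes a local value of 0.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triple_counts(adj):
    """For each vertex, return (closed neighbor pairs, all neighbor pairs)."""
    counts = {}
    for v, nbrs in adj.items():
        pairs = list(combinations(sorted(nbrs), 2))
        closed = sum(1 for a, b in pairs if b in adj[a])
        counts[v] = (closed, len(pairs))
    return counts


def average_local_clustering(adj):
    """Watts-Strogatz style: mean of the per-vertex clustering values.

    Assumption: vertices with fewer than two neighbors count as 0.
    """
    counts = triple_counts(adj)
    local = [c / t if t else 0.0 for c, t in counts.values()]
    return sum(local) / len(local)


def transitivity(adj):
    """Newman style: closed connected triples over all connected triples."""
    counts = triple_counts(adj)
    closed = sum(c for c, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return closed / total if total else 0.0


# Hypothetical toy graph: one triangle (a, b, c) plus two leaves on hub "a".
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d"), ("a", "e")]
adj = build_adjacency(edges)
ws = average_local_clustering(adj)
nm = transitivity(adj)
print(f"average local clustering: {ws:.4f}")
print(f"transitivity:             {nm:.4f}")
```

On this toy graph the two values differ (13/30 versus 3/8) because the average of local values weights every vertex equally, while transitivity weights each vertex by its number of neighbor pairs, so the high-degree hub dominates.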

Bibliographic record

  • Author

    Ma Jian James

  • Year 2011
  • Original format PDF
  • Text language en
