People, processes, and products: Case studies in open-source software using complex networks.

Abstract

Open-source software has become increasingly popular. Many startup companies and small business owners adopt open-source software packages to meet their daily office computing needs or to build their IT infrastructure. Unlike proprietary software systems, open-source software systems usually have a loosely organized developer collaboration structure. Developers work on their "assignments" on a voluntary basis, and many never physically meet their "co-workers." This unique collaboration pattern leads to a unique software development process, and hence to a unique structure of software products. These characteristics of open-source software motivate this dissertation. Our research follows the framework of the four key elements of software engineering: Project, People, Process and Product (Jacobson, Booch et al. 1999). This dissertation studies three of the four P's: People, Process and Product.

Because many open-source software packages are large and highly complex, the traditional analysis methods and measures of software engineering cannot be readily applied to them. In this dissertation, we adopt complex network theory to analyze open-source software packages, the software development process, and the collaboration among software developers. We aim to discover common characteristics shared by different open-source software packages and to provide a possible explanation of how those software products are developed. Specifically, we represent real-world entities, such as open-source source code or developer collaborations, as networks of interconnected vertices. We then use the topological metrics established in complex network theory to analyze those networks. We also propose our own random network growth model to illustrate open-source software development processes. Our results can potentially be used by software practitioners who are interested in developing high-quality software products and reducing risk in the development process.

Chapter 1 introduces the dissertation's structure and research scope. We aim to study open-source software with complex networks. The details of the 4-P framework are introduced in that chapter.

Chapter 2 analyzes five C-language open-source software packages by means of function dependency networks. That chapter calculates the topological measures of the dependency networks extracted from the software source code.

Chapter 3 analyzes the collaborative relationships among open-source software developers. We extract developers' co-working data from two software bug-fixing data sets. Again using complex network theory, we find a number of topological characteristics of the software developer networks, such as the scale-free property. We also identify topological differences between the bug side and the developer side of the extracted bipartite networks.

Chapter 4 compares two widely adopted clustering coefficient definitions, one proposed by Watts and Strogatz and the other by Newman. The analytical similarities and differences between the two definitions provide useful guidance for the random network growth model presented in the next chapter.

Chapter 5 aims to characterize the open-source software development process. We propose a two-phase network growth model to illustrate that process. Our model describes how different source code units interconnect as the software grows. A case study was performed using the same five open-source software packages analyzed in Chapter 2. The empirical results demonstrate that our model provides a possible explanation of how open-source software products are developed.

Chapter 6 concludes the dissertation and highlights possible future research directions.
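The two clustering coefficient definitions compared in Chapter 4 can give different values on the same network. The following is a minimal sketch of that difference, not code from the dissertation: the function names and the four-vertex toy graph are assumptions made for illustration, and vertices with fewer than two neighbors are assumed to contribute zero to the Watts-Strogatz average (one common convention).

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)}, skipping self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def linked_neighbor_pairs(adj, node):
    """Count pairs of neighbors of `node` that are themselves connected."""
    return sum(1 for u, v in combinations(adj[node], 2) if v in adj[u])


def watts_strogatz_clustering(adj):
    """Mean over vertices of the local ratio: linked neighbor pairs / possible neighbor pairs."""
    if not adj:
        return 0.0
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # assumed convention: such vertices contribute 0
        total += linked_neighbor_pairs(adj, node) / (k * (k - 1) / 2)
    return total / len(adj)


def newman_transitivity(adj):
    """Global ratio: 3 * (number of triangles) / (number of connected triples)."""
    closed = 0   # each triangle is counted once per corner, i.e. 3 times in total
    triples = 0  # paths of length two, counted at their center vertex
    for node, nbrs in adj.items():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        closed += linked_neighbor_pairs(adj, node)
    return closed / triples if triples else 0.0


# Hypothetical toy graph: a triangle a-b-c with one pendant vertex d attached to c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
toy_adj = build_adjacency(toy_edges)

print(round(watts_strogatz_clustering(toy_adj), 4))  # 0.5833
print(round(newman_transitivity(toy_adj), 4))        # 0.6
```

On this toy graph the averaged local coefficient is 7/12 while the global ratio is 3/5. The first definition weights every vertex equally; the second weights every connected triple equally, so high-degree vertices count for more.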

Bibliographic record

  • Author

    Ma, Jian James

  • Author affiliation

    The University of Arizona

  • Degree-granting institution: The University of Arizona
  • Subject: Information Technology; Information Science; Computer Science
  • Degree: Ph.D.
  • Year: 2011
  • Pages: 136 p.
  • Total pages: 136
  • Original format: PDF
  • Language: English
