Functional module identification and function prediction from protein interaction networks.

Abstract

Since the completion of human genome sequencing, uncovering the principles of protein interactions and the functional roles of proteins has been in the spotlight of the post-genomic era. The interactions between proteins provide insights into the underlying mechanisms of biological processes within a cell. The functions of an uncharacterized protein can be postulated on the basis of evidence of its interactions with known proteins. The systematic analysis of protein interaction networks has thus become a primary issue in current bioinformatics research. A wide range of graph-theoretic and statistical approaches have attempted to analyze protein interaction networks effectively. However, their accuracy and efficiency have been limited by the following challenges. First, protein-protein interaction data generated by large-scale high-throughput experiments are not reliable. Second, protein interaction networks typically have complex connectivity. Finally, each protein performs multiple functions under varying environmental conditions.

In this dissertation, I explore the quantitative characterization of protein interaction networks based on their unique features, such as the small-world phenomenon, scale-free degree distribution, and hierarchical modularity. In particular, I focus on accurate and efficient mining of protein interaction networks for the purpose of identifying functional modules and predicting protein functions. A functional module is defined as a maximal set of proteins that participate in the same function. As a pre-processing step, the network is weighted by integrating functional knowledge from the Gene Ontology database. The semantic similarity and semantic interactivity measures estimate the reliability of each interaction, which is assigned to the corresponding edge as a weight. These weighted interaction networks facilitate accurate analysis for functional knowledge discovery.

I introduce four different approaches for functional module identification and function prediction. First, in the information flow-based approach, I design a novel information flow model that quantifies the propagation of the functional information of a protein over the entire complex network. To implement this model efficiently, I propose a dynamic flow simulation algorithm based on random walks. The flow pattern of a protein generated by this algorithm indicates its functional impact on the other proteins. Second, the graph restructuring approach restructures a protein interaction network into a hub-oriented hierarchical structure based on new definitions of path strength and centrality. This algorithm thus reveals the hierarchically organized functional modules and hubs. Third, the association pattern-based approach searches for functional association patterns that occur frequently in a protein interaction network. I apply a frequent sub-graph mining algorithm to the labeled graph that is generated by assigning the set of functions of each protein to its node as a label. Finally, graph reduction is a technique for simplifying the complex connectivity pattern of a protein interaction network. Using the reduced graph, modularization is performed by an iterative procedure of minimum weighted cut and node accumulation.

The generation of protein-protein interaction data is proceeding rapidly, heightening the demand for advances in computational methods to analyze these complex data sets. The approaches presented in this dissertation employ novel, advanced data-mining techniques to discover valuable functional knowledge hidden in complex protein interaction networks. This knowledge can form the basis of practical applications in biomedical science, e.g., disease diagnosis and drug development. Currently, explosive amounts of heterogeneous biological data are being produced. Developing effective methods for integrating such data is a promising direction for future research.
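
To make the idea of a random walk-based flow pattern more concrete, the following is a minimal sketch of a generic random walk with restart on a weighted interaction graph. It is not the dissertation's flow simulation algorithm, whose details the abstract does not give. The function name `flow_pattern`, the `restart` parameter, the toy protein names, and the edge weights (standing in for reliability scores) are all assumptions made for illustration.

```python
from collections import defaultdict


def flow_pattern(edges, source, restart=0.3, iterations=100):
    """Random walk with restart from `source` on a weighted undirected graph.

    edges: iterable of (u, v, weight) with weight > 0; the weights here are
    hypothetical stand-ins for interaction reliability scores.
    Returns a dict mapping each node to its visiting probability.
    """
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    if source not in adj:
        raise KeyError(f"source {source!r} has no edges in the graph")

    # Total edge weight at each node, used to normalize transition probabilities.
    strength = {u: sum(nbrs.values()) for u, nbrs in adj.items()}

    prob = {u: 0.0 for u in adj}
    prob[source] = 1.0
    for _ in range(iterations):
        nxt = {u: 0.0 for u in adj}
        for u, p in prob.items():
            # Move along each edge in proportion to its weight.
            for v, w in adj[u].items():
                nxt[v] += (1.0 - restart) * p * w / strength[u]
        # With probability `restart`, jump back to the source protein.
        nxt[source] += restart
        prob = nxt
    return prob


# Toy network (assumed data): two tight triangles joined by one weak edge.
edges = [
    ("A", "B", 0.9), ("B", "C", 0.9), ("A", "C", 0.9),
    ("D", "E", 0.9), ("E", "F", 0.9), ("D", "F", 0.9),
    ("C", "D", 0.1),
]
pattern = flow_pattern(edges, source="A")
for node in sorted(pattern, key=pattern.get, reverse=True):
    print(node, round(pattern[node], 4))
```

In this toy network, most of the probability mass stays inside the triangle containing the source, because the single low-weight edge carries little flow to the other triangle. That is the sense in which a flow pattern over a reliability-weighted network can hint at module membership, under the assumptions stated above.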

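Record details

  • Author

    Cho, Young-Rae

  • Author affiliation

    State University of New York at Buffalo

  • Degree-granting institution: State University of New York at Buffalo
  • Subject: Biology, Bioinformatics; Computer Science
  • Degree: Ph.D.
  • Year: 2009
  • Pages: 195 p.
  • Total pages: 195
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Automation technology, computer technology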
