Weighted Graph Matching Approaches to Structure Comparison and Alignment and their Application to Biological Problems.


Abstract

In pattern recognition and machine learning, comparing and contrasting are the most fundamental operations: from similarities we derive common rules encoded in the systems, while from differences we infer what makes each system unique. The biological sciences are no exception and, in fact, rely heavily on these operations. More recently, the emergence of high-throughput measurement technologies has highlighted the need for novel approaches that enhance our ability to understand complex relationships in the resulting data sets. Often, these relationships are best represented as graphs (or networks), where nodes are biochemical components such as genes, RNAs, proteins, or metabolites, and edges indicate the type (and often the quality) of each relationship. Comparison of relationships is generally performed by aligning the networks of interest. For example, for protein-protein interaction (PPI) networks, the goal of network alignment is to find mappings between nodes (proteins); such mappings are highly useful for identifying signaling pathways or protein complexes and for annotating genes of unknown function from subnetworks conserved across multiple species. Phylogenetic trees are also graph structures; they describe evolutionary relationships among groups of organisms and their hypothetical ancestors. As a large volume of previous work has shown, comparison of trees also opens the possibility of supporting or building new evolutionary hypotheses: for example, the detection of host-parasite symbiosis, gene coevolution as a signal of physical interactions among genes, or nonstandard events such as horizontal gene transfer.

The goal of this thesis is to develop and implement a flexible set of algorithms and methodologies that can be used for the alignment of trees and/or networks of various sizes and properties. We first define a new relaxed model of graph isomorphism in which shortest path lengths are preserved between corresponding node pairs. Then, based on Google's PageRank model, we present a new tree matching approach, phyloAligner, which resolves several weaknesses of previous approaches. We further generalize this tree matching algorithm to a broader, flexible framework, MCS-Finder, a scalable and error-tolerant approximation for identifying the maximum common substructure between weighted graphs or distance matrices of different sizes. For phylogenetic trees with weighted edges and strictly labeled nodes, multidimensional scaling-based methods (xCEED) can effectively evaluate structural similarity and identify which regions are congruent or incongruent. These methods successfully detected coevolutionary signals as well as nonstandard evolutionary events such as horizontal gene transfer, and recovered interaction specificity between multigene families.
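The relaxed isomorphism notion mentioned in the abstract (shortest path lengths preserved between corresponding node pairs) can be illustrated with a small check. The following is only a minimal sketch under stated assumptions, not the thesis's implementation: the function names, the dict-of-dicts graph representation, the tolerance `tol`, and the toy graphs are all hypothetical, and the node mapping is supplied by hand rather than found by phyloAligner or MCS-Finder.

```python
import heapq
from itertools import combinations

INF = float("inf")


def undirected(edges):
    """Build a dict-of-dicts undirected weighted graph from (u, v, w) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def shortest_path_lengths(graph, source):
    """Dijkstra's algorithm; returns {node: distance from source}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, INF):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def preserves_shortest_paths(g1, g2, mapping, tol=1e-9):
    """True if every mapped node pair has (nearly) equal shortest path
    length in g1 and g2. `mapping` sends nodes of g1 to nodes of g2.
    This checks a given mapping; it does not search for one."""
    d1 = {u: shortest_path_lengths(g1, u) for u in mapping}
    d2 = {v: shortest_path_lengths(g2, v) for v in mapping.values()}
    for a, b in combinations(mapping, 2):
        x = d1[a].get(b, INF)
        y = d2[mapping[a]].get(mapping[b], INF)
        if x == INF and y == INF:
            continue  # unreachable in both graphs counts as preserved
        if abs(x - y) > tol:
            return False
    return True


# Toy example (assumed data): g2 subdivides one edge of g1, so the graphs
# are not isomorphic, yet leaf-to-leaf path lengths agree.
g1 = undirected([("A", "X", 1), ("X", "B", 2), ("X", "C", 3)])
g2 = undirected([("a", "y", 1), ("y", "m", 1), ("m", "b", 1), ("y", "c", 3)])

good_mapping = {"A": "a", "B": "b", "C": "c"}
bad_mapping = {"A": "b", "B": "a", "C": "c"}

print(preserves_shortest_paths(g1, g2, good_mapping))  # True
print(preserves_shortest_paths(g1, g2, bad_mapping))   # False
```

Only the mapped nodes are compared, so unmapped internal nodes (here `X`, `y`, and `m`) may differ between the two graphs. That is one plausible reading of why such a model is more relaxed than strict isomorphism.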

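Bibliographic record

  • Author

    Choi, Kwangbom

  • Author affiliation

    The University of North Carolina at Chapel Hill

  • Degree-granting institution: The University of North Carolina at Chapel Hill
  • Subject: Biology Bioinformatics; Computer Science
  • Degree: Ph.D.
  • Year: 2011
  • Pages: 110 p.
  • Total pages: 110
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification:
  • Keywords:

  • Date added to database: 2022-08-17 11:45:19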
