Combinatorial Approaches to Accurate Identification of Orthologous Genes.

Abstract

The accurate identification of orthologous genes across different species is a critical and challenging problem in comparative genomics, with a wide spectrum of biological applications including gene function inference, evolutionary studies, and systems biology. Over the past several years, many methods have been proposed for ortholog assignment based on sequence similarity, phylogenetic approaches, synteny information, and genome rearrangement. Although these methods agree on many ortholog assignments, each tends to produce an assignment significantly different from the others.

In this dissertation, we study the problem of assigning orthologous genes among closely related genomes on a genome scale. We first briefly review the existing methods for ortholog assignment in the literature and then compare them comprehensively. We then propose a new combinatorial approach for assigning ortholog pairs between a pair of closely related genomes that addresses the limitations of the existing methods. Our approach is based on the parsimony principle: it transforms one genome into another while minimizing the number of genome rearrangement events, including reversals, transpositions, fusions, fissions, and gene duplications. By explicitly incorporating a tandem gene duplication model and combining it with phylogenetic approaches, we develop an improved system, MSOAR 2.0. Our experimental results on both simulated and real data show that MSOAR 2.0 achieves the highest overall prediction accuracy among the programs compared.

Based on the pairwise genome comparison results, we extend our ortholog assignment method to multiple genome comparison and develop a new system, MultiMSOAR 2.0, to identify ortholog groups among multiple genomes. In MultiMSOAR 2.0, pairwise orthology information produced by MSOAR 2.0 is used to construct a multipartite graph for each gene family. To partition each gene family into disjoint sets of orthologous genes, a multidimensional matching problem is formulated and a heuristic maximum weight matching algorithm is proposed. The partition results are then used to label the species tree. Taking some biological constraints into account, we formulate the tree labeling problem in a combinatorial optimization framework and develop two dynamic programming algorithms to solve it. Our experimental results show that MultiMSOAR 2.0 achieves much higher prediction accuracy than the existing ortholog assignment systems for multiple genomes. Moreover, MultiMSOAR 2.0 also provides information about gene births, duplications, and losses in evolution, which may be of independent biological interest.
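To make the partitioning step more concrete, the following is a minimal sketch of one simple greedy way to turn weighted pairwise ortholog pairs into disjoint groups with at most one gene per genome. It is not the MultiMSOAR 2.0 algorithm, which the abstract does not specify; the function name `greedy_ortholog_groups`, the `(genome, gene)` node encoding, and the example weights are all assumptions made for illustration.

```python
from collections import defaultdict


def greedy_ortholog_groups(pairs):
    """Greedily merge pairwise ortholog pairs into disjoint groups.

    pairs: iterable of ((genome, gene), (genome, gene), weight).
    Hypothetical illustration only: pairs are processed from highest to
    lowest weight, and two groups are merged only if they share no genome,
    so every resulting group holds at most one gene per genome.
    """
    pairs = list(pairs)
    parent = {}   # union-find parent pointers
    genomes = {}  # group root -> set of genomes present in that group

    def find(node):
        # Find the group root, compressing the path along the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Every gene starts in its own singleton group.
    for a, b, _ in pairs:
        for node in (a, b):
            if node not in parent:
                parent[node] = node
                genomes[node] = {node[0]}

    # Heaviest pairs first; skip merges that would put two genes
    # from the same genome into one group.
    for a, b, _ in sorted(pairs, key=lambda p: -p[2]):
        ra, rb = find(a), find(b)
        if ra == rb or genomes[ra] & genomes[rb]:
            continue
        parent[rb] = ra
        genomes[ra] |= genomes.pop(rb)

    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


# Made-up example: one gene family spread over genomes A, B, and C.
example_pairs = [
    (("A", "a1"), ("B", "b1"), 0.9),
    (("B", "b1"), ("C", "c1"), 0.8),
    (("A", "a2"), ("C", "c1"), 0.7),
    (("A", "a2"), ("B", "b2"), 0.6),
]
print(greedy_ortholog_groups(example_pairs))
```

In this toy input, the pair between `a2` and `c1` is skipped because `c1` already belongs to a group containing a gene from genome A, so `a2` ends up grouped with `b2` instead. A greedy pass like this gives no optimality guarantee, which is one reason a matching formulation such as the one described in the abstract may be preferable.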

Bibliographic record

  • Author

    Shi, Guanqun

  • Author affiliation

    University of California, Riverside

  • Degree-granting institution: University of California, Riverside
  • Subject: Biology Systematic; Computer Science; Biology Bioinformatics
  • Degree: Ph.D.
  • Year: 2011
  • Pages: 86 p.
  • Total pages: 86
  • Original format: PDF
  • Text language: eng