Inference of parsimonious species phylogenies from multi-locus data.


Abstract

The main focus of this dissertation is the inference of species phylogenies, i.e., the evolutionary histories of species. Species phylogenies allow us to gain insights into the mechanisms of evolution and to hypothesize past evolutionary events. They also find applications in medicine, for example in understanding antibiotic resistance in bacteria. The reconstruction of species phylogenies is, therefore, of both biological and practical importance.

In the traditional method for inferring species trees from genetic data, we sequence a single locus in species genomes, reconstruct a gene tree, and report it as the species tree. Biologists have long acknowledged that a gene tree can differ from the species tree, which implies that this traditional method might infer the wrong species tree. Moreover, reticulate events such as horizontal gene transfer and hybridization make the evolution of species no longer tree-like. The availability of multi-locus data provides us with excellent opportunities to resolve those long-standing problems. In this dissertation, we present parsimony-based algorithms for reconciling species/gene tree incongruence that is assumed to be due solely to lineage sorting. We also describe a unified framework for detecting hybridization despite lineage sorting.

To address the first problem, species/gene tree incongruence caused by lineage sorting, we present three algorithms. In Chapter 3, we present an algorithm based on an integer linear programming (ILP) formulation to infer the species tree's topology and divergence times from multiple gene trees. In Chapter 4, we describe two methods that infer the species tree by minimizing deep coalescences (MDC), a criterion introduced by Maddison in 1997. The first method is also based on an ILP formulation, but it eliminates the phase of the Chapter 3 algorithm that enumerates candidate species trees. The second method further eliminates the dependence on external ILP solvers by employing dynamic programming. We ran those methods on both biological and simulated data, and experimental results demonstrate their high accuracy and speed in species tree inference, which makes them suitable for analyzing multi-locus data.

The second problem this dissertation deals with is the detection of reticulation (e.g., horizontal gene transfer, hybridization) despite lineage sorting. The phylogeny-based approach compares the evolutionary histories of different genomic regions and tests them for incongruence that would indicate hybridization. However, since species tree and gene tree incongruence can also be due to lineage sorting, phylogeny-based hybridization methods might overestimate the amount of hybridization. We present in this dissertation a framework that can handle both hybridization and lineage sorting simultaneously. In this framework, we extend the MDC criterion to phylogenetic networks and use it to propose a heuristic to detect hybridization despite lineage sorting. Empirical results on a simulated data set and a yeast data set show promising performance and suggest several directions for future research.
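To make the MDC criterion mentioned in the abstract more concrete, the following is a minimal Python sketch that counts the extra lineages (deep coalescences) a single gene tree implies on a fixed species tree. It is an illustration only, not the dissertation's ILP or dynamic programming algorithm. It assumes rooted trees written as nested tuples of leaf names, with exactly one sampled allele per species; the function names `leaves`, `count_lineages`, and `extra_lineages` are hypothetical.

```python
def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def count_lineages(gene, cluster):
    """Count maximal gene-tree subtrees whose leaves all lie in `cluster`.

    Each such subtree is one gene lineage leaving the species-tree branch
    above the node whose leaf set is `cluster` (going back in time).
    """
    if leaves(gene) <= cluster:
        return 1
    if isinstance(gene, str):
        return 0  # a leaf outside the cluster contributes nothing
    return sum(count_lineages(child, cluster) for child in gene)


def extra_lineages(species, gene):
    """Sum (lineages - 1) over every non-root branch of the species tree."""
    total = 0

    def visit(node, is_root):
        nonlocal total
        if not is_root:
            total += max(count_lineages(gene, leaves(node)) - 1, 0)
        if not isinstance(node, str):
            for child in node:
                visit(child, False)

    visit(species, True)
    return total


species_tree = (("A", "B"), ("C", "D"))
congruent_gene_tree = (("A", "B"), ("C", "D"))
conflicting_gene_tree = (("A", "C"), ("B", "D"))

print(extra_lineages(species_tree, congruent_gene_tree))    # 0
print(extra_lineages(species_tree, conflicting_gene_tree))  # 2
```

Under these assumptions, a candidate species tree could be scored against several loci by summing `extra_lineages` over all gene trees, and the MDC criterion prefers the candidate with the smallest sum. The sketch recomputes leaf sets on every call, which is fine for small examples but not for large trees.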

Bibliographic record

  • Author

    Than, Cuong V.

  • Author affiliation

    Rice University

  • Degree-granting institution: Rice University
  • Subject: Biology, Bioinformatics; Computer Science
  • Degree: Ph.D.
  • Year: 2010
  • Pages: 162 p.
  • Total pages: 162
  • Original format: PDF
  • Text language: English
