Journal of Mathematical Biology

Counting and sampling gene family evolutionary histories in the duplication-loss and duplication-loss-transfer models

Abstract

Given a set of species whose evolution is represented by a species tree, a gene family is a group of genes that evolved from a single ancestral gene. A gene family evolves along the branches of a species tree through various mechanisms, including, but not limited to, speciation (S), gene duplication (D), gene loss (L), and horizontal gene transfer (T). The reconstruction of a gene tree representing the evolution of a gene family constrained by a species tree is an important problem in phylogenomics. However, unlike in the multispecies coalescent evolutionary model, which considers only speciation and incomplete lineage sorting events, very little is known about the search space for gene family histories accounting for gene duplication, gene loss and horizontal gene transfer (the DLT-model). In this work, we introduce the notion of an evolutionary history, defined as a binary ordered rooted tree describing the evolution of a gene family, constrained by a species tree in the DLT-model. We provide formal grammars describing the set of all evolutionary histories that are compatible with a given species tree, whether it is ranked or unranked. These grammars allow us, using either analytic combinatorics or dynamic programming, to efficiently compute the number of histories of a given size, and also to generate random histories of a given size under the uniform distribution. We apply these tools to obtain exact asymptotics for the number of gene family histories for two species trees, the rooted caterpillar and the complete binary tree, as well as estimates of the range of the exponential growth factor of the number of histories for random species trees of size up to 25. Our results show that including horizontal gene transfers induces a dramatic increase in the number of evolutionary histories. We also show that, within ranked species trees, the number of evolutionary histories in the DLT-model is almost independent of the species tree topology. These results establish firm foundations for the development of ensemble methods for the prediction of reconciliations.
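
To illustrate how a grammar supports both counting by dynamic programming and uniform random generation, here is a minimal sketch on a toy grammar. The grammar `T -> leaf | node(T, T)`, with size measured in leaves, is an assumption chosen for brevity; it describes plain ordered rooted binary trees and is not the DL or DLT grammar of the paper, which also depends on the species tree. The function names `count_trees`, `sample_tree` and `leaves` are hypothetical. Sampling follows the standard recursive method: choose the split at the root with probability proportional to the number of trees having that split, then recurse.

```python
import random
from functools import lru_cache

# Toy grammar (illustrative assumption, not the paper's DL/DLT grammar):
#   T -> leaf | node(T, T)
# The size of a tree is its number of leaves.


@lru_cache(maxsize=None)
def count_trees(n: int) -> int:
    """Count ordered rooted binary trees with n leaves by dynamic programming."""
    if n < 1:
        return 0
    if n == 1:
        return 1
    # A tree with n leaves splits into a left subtree with k leaves
    # and a right subtree with n - k leaves.
    return sum(count_trees(k) * count_trees(n - k) for k in range(1, n))


def sample_tree(n: int, rng: random.Random):
    """Draw a tree with n leaves uniformly at random (recursive method)."""
    if n == 1:
        return "leaf"
    # Pick an index among all trees of size n, then find which root split
    # (k, n - k) contains it; each split is hit with probability
    # proportional to the number of trees it accounts for.
    r = rng.randrange(count_trees(n))
    for k in range(1, n):
        block = count_trees(k) * count_trees(n - k)
        if r < block:
            return (sample_tree(k, rng), sample_tree(n - k, rng))
        r -= block
    raise AssertionError("unreachable: counts are inconsistent")


def leaves(tree) -> int:
    """Number of leaves of a sampled tree."""
    if tree == "leaf":
        return 1
    return leaves(tree[0]) + leaves(tree[1])
```

For example, `count_trees(5)` returns 14, and `sample_tree(7, random.Random(0))` returns one of the 132 trees with seven leaves, each with equal probability. The same two-step pattern (tabulate counts per grammar symbol and size, then sample top-down using those counts) carries over to richer grammars, at the cost of one count table per nonterminal.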