Journal: BMC Bioinformatics

Maximum likelihood models and algorithms for gene tree evolution with duplications and losses

Abstract

Background: The abundance of new genomic data provides the opportunity to map the location of gene duplication and loss events on a species phylogeny. The first methods for mapping gene duplications and losses were based on a parsimony criterion: they find the mapping that minimizes the number of duplication and loss events. Probabilistic modeling of gene duplication and loss is relatively new and has largely focused on birth-death processes.

Results: We introduce a new maximum likelihood model that estimates the speciation, gene duplication, and gene loss events in a gene tree within a species tree with branch lengths. We also provide an algorithm that is efficient in practice and computes optimal evolutionary scenarios for this model. We implemented the algorithm in the program DrML and verified its performance with empirical and simulated data.

Conclusions: In test data sets, DrML finds optimal gene duplication and loss scenarios within minutes, even when the gene trees contain sequences from several hundred species. In many cases, these optimal scenarios differ from the lca-mapping that results from a parsimony gene tree reconciliation. Thus, DrML provides a new, practical statistical framework in which to study gene duplication.
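For readers unfamiliar with the parsimony baseline that the abstract compares against, the lca-mapping can be sketched in a few lines. This is only an illustration of the standard parsimony reconciliation, not the maximum likelihood algorithm implemented in DrML. The tree encodings (a child-to-parent dictionary for the species tree, nested tuples for the gene tree) and all function names are assumptions made for this example, and the sketch counts duplications only, not losses.

```python
def ancestors(parent, node):
    """Return the path from node up to the species-tree root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def lca(parent, a, b):
    """Least common ancestor of species nodes a and b."""
    seen = set(ancestors(parent, a))
    for node in ancestors(parent, b):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")


def lca_reconcile(gene_tree, parent):
    """Map a gene tree into the species tree by lca-mapping.

    gene_tree is either a species name (leaf) or a 2-tuple of subtrees.
    Returns (species node the subtree root maps to, number of duplications).
    """
    if isinstance(gene_tree, str):
        return gene_tree, 0
    left, right = gene_tree
    m_left, d_left = lca_reconcile(left, parent)
    m_right, d_right = lca_reconcile(right, parent)
    m = lca(parent, m_left, m_right)
    # A gene-tree node is a duplication if it maps to the same
    # species node as at least one of its children.
    is_dup = m in (m_left, m_right)
    return m, d_left + d_right + int(is_dup)


# Hypothetical species tree ((A,B)AB,C)ABC, stored as child -> parent.
species_parent = {"A": "AB", "B": "AB", "AB": "ABC", "C": "ABC"}

congruent = (("A", "B"), "C")   # same topology as the species tree
discordant = (("A", "C"), "B")  # conflicts with the species tree

print(lca_reconcile(congruent, species_parent))
print(lca_reconcile(discordant, species_parent))
```

In this toy case the congruent gene tree needs no duplications, while the discordant one is explained by a single duplication at the root. A likelihood-based method such as the one described above may prefer a different scenario once species-tree branch lengths are taken into account.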