Inference of reticulate evolutionary histories by maximum likelihood: the performance of information criteria

Abstract

Background: Maximum likelihood has been widely used for over three decades to infer phylogenetic trees from molecular data. When reticulate evolutionary events occur, several genomic regions may have conflicting evolutionary histories, and a phylogenetic network may provide a more adequate model for representing the evolutionary history of the genomes or species. A maximum likelihood (ML) model has been proposed for this case and accounts for both mutation within a genomic region and reticulation across the regions. However, the performance of this model in terms of inferring information about reticulate evolution and properties that affect this performance have not been studied.

Results: In this paper, we study the effect of the evolutionary diameter and height of a reticulation event on its identifiability under ML. We find both of them, particularly the diameter, have a significant effect. Further, we find that the number of genes (which can be generalized to the concept of "non-recombining genomic regions") that are transferred across a reticulation edge affects its detectability. Last but not least, a fundamental challenge with phylogenetic networks is that they allow an arbitrary level of complexity, giving rise to the model selection problem. We investigate the performance of two information criteria, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), for addressing this problem. We find that BIC performs well in general for controlling the model complexity and preventing ML from grossly overestimating the number of reticulation events.

Conclusion: Our results demonstrate that BIC provides a good framework for inferring reticulate evolutionary histories. Nevertheless, the results call for caution when interpreting the accuracy of the inference particularly for data sets with particular evolutionary features.
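To illustrate the model selection problem described in the abstract, the following is a minimal sketch of how AIC and BIC rank candidate models using their standard textbook definitions, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The candidate networks, their log-likelihoods, parameter counts, and the sample size are all hypothetical values chosen for illustration; the abstract does not state how the authors count free parameters or define the sample size, so those choices are assumptions here.

```python
import math

def aic(log_lik: float, num_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * num_params - 2 * log_lik

def bic(log_lik: float, num_params: int, sample_size: int) -> float:
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return num_params * math.log(sample_size) - 2 * log_lik

# Hypothetical candidates: each extra reticulation adds free parameters
# and can only improve (or match) the maximized log-likelihood.
candidates = [
    {"reticulations": 0, "log_lik": -1050.0, "params": 10},
    {"reticulations": 1, "log_lik": -1020.0, "params": 13},
    {"reticulations": 2, "log_lik": -1016.0, "params": 16},
]
sample_size = 100  # assumed number of data points (illustrative only)

def best_by_aic(models):
    """Return the candidate with the lowest AIC score."""
    return min(models, key=lambda m: aic(m["log_lik"], m["params"]))

def best_by_bic(models, n):
    """Return the candidate with the lowest BIC score."""
    return min(models, key=lambda m: bic(m["log_lik"], m["params"], n))

if __name__ == "__main__":
    for m in candidates:
        print(
            m["reticulations"],
            round(aic(m["log_lik"], m["params"]), 2),
            round(bic(m["log_lik"], m["params"], sample_size), 2),
        )
    print("AIC picks:", best_by_aic(candidates)["reticulations"])
    print("BIC picks:", best_by_bic(candidates, sample_size)["reticulations"])
```

With these made-up numbers, the small likelihood gain from a second reticulation is enough to win under AIC but not under BIC, because the BIC penalty per parameter (ln n) exceeds the AIC penalty (2) whenever n is greater than about 7. This only shows the mechanics of the two criteria and says nothing about the paper's actual data or results.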