Journal: BMC Evolutionary Biology

Incorporating indel information into phylogeny estimation for rapidly emerging pathogens


Abstract

Background: Phylogenies of rapidly evolving pathogens can be difficult to resolve because of the small number of substitutions that accumulate in the short times since divergence. To improve resolution of such phylogenies we propose using insertion and deletion (indel) information in addition to substitution information. We accomplish this through joint estimation of alignment and phylogeny in a Bayesian framework, drawing inference using Markov chain Monte Carlo. Joint estimation of alignment and phylogeny sidesteps biases that stem from conditioning on a single alignment by taking into account the ensemble of near-optimal alignments.

Results: We introduce a novel Markov chain transition kernel that improves computational efficiency by proposing non-local topology rearrangements and by block sampling alignment and topology parameters. In addition, we extend our previous indel model to increase biological realism by placing indels preferentially on longer branches. We demonstrate the ability of indel information to increase phylogenetic resolution in examples drawn from within-host viral sequence samples. We also demonstrate the importance of taking alignment uncertainty into account when using such information. Finally, we show that codon-based substitution models can significantly affect alignment quality and phylogenetic inference by unrealistically forcing indels to begin and end between codons.

Conclusion: These results indicate that indel information can improve phylogenetic resolution of recently diverged pathogens and that alignment uncertainty should be considered in such analyses.
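
The idea of placing indels preferentially on longer branches can be illustrated with a small sketch. This is not the authors' indel model, whose details the abstract does not give. It assumes, purely for illustration, that indel events arrive along a branch as a Poisson process with a constant rate, so the probability of at least one indel on a branch of length t is 1 - exp(-rate * t). The function name `indel_probability`, the rate value, and the branch lengths are all hypothetical.

```python
import math


def indel_probability(branch_length: float, indel_rate: float) -> float:
    """Probability of at least one indel event on a branch.

    Assumption (not taken from the paper): indel events follow a Poisson
    process with constant rate `indel_rate` per unit branch length.
    """
    if branch_length < 0 or indel_rate < 0:
        raise ValueError("branch_length and indel_rate must be non-negative")
    return 1.0 - math.exp(-indel_rate * branch_length)


# Hypothetical branch lengths (expected substitutions per site) and indel rate.
branch_lengths = {"short": 0.01, "medium": 0.1, "long": 0.5}
rate = 2.0

probs = {name: indel_probability(t, rate) for name, t in branch_lengths.items()}
for name, p in probs.items():
    print(f"{name}: {p:.3f}")
```

Under this assumption the probability rises monotonically with branch length, which is the qualitative behavior the abstract describes. A model without branch-length dependence would assign every branch the same indel probability.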