Source: Viruses (NIH literature collection)

The Effect of Sample Bias and Experimental Artefacts on the Statistical Phylogenetic Analysis of Picornaviruses

Abstract

Statistical phylogenetic methods are a powerful tool for inferring the evolutionary history of viruses through time and space. The selection of mathematical models and analysis parameters has a major impact on the outcome and has been relatively well described in the literature. The preparation of a sequence dataset is less formalized, but its impact can be even more profound. This article used simulated datasets of enterovirus sequences to evaluate the effect of sample bias on picornavirus phylogenetic studies. Possible approaches to the reduction of large datasets and their potential for introducing additional artefacts were demonstrated. The most consistent results were obtained using “smart sampling”, which reduced sequence subsets from large studies more than those from smaller ones in order to preserve the rare sequences in a dataset. The effect of sequences with technical or annotation errors in the Bayesian framework was also analyzed. Sequences with about 0.5% sequencing errors or incorrect isolation dates altered by just 5 years could be detected by various approaches, but the efficiency of identification depended upon sequence position in a phylogenetic tree. Even a single erroneous sequence could profoundly destabilize the whole analysis by increasing the variance of the inferred evolutionary parameters.
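
The abstract does not give the exact rule used for “smart sampling”, so the following is only a minimal sketch of one way such a reduction could work, assuming a fixed per-study cap. The function name `smart_sample`, the `cap` parameter, and the `(sequence_id, study_id)` record layout are hypothetical and are not taken from the article. Studies with more sequences than the cap are randomly thinned, while smaller studies are kept whole, so rare sequences survive the reduction.

```python
import random
from collections import defaultdict


def smart_sample(records, cap, seed=0):
    """Reduce a dataset by capping the number of sequences kept per study.

    records: list of (sequence_id, study_id) pairs (hypothetical layout).
    cap: maximum number of sequences retained from any single study (assumed rule).
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible

    # Group sequence identifiers by the study they came from.
    by_study = defaultdict(list)
    for seq_id, study_id in records:
        by_study[study_id].append(seq_id)

    kept = []
    for study_id in sorted(by_study):
        ids = by_study[study_id]
        # Large studies are thinned to the cap; small studies pass through intact.
        if len(ids) > cap:
            ids = rng.sample(ids, cap)
        kept.extend(ids)
    return kept


# Toy data: one large study of 100 sequences and one small study of 2.
records = [(f"A{i}", "large_study") for i in range(100)]
records += [("B0", "small_study"), ("B1", "small_study")]

subset = smart_sample(records, cap=10)
```

With these toy inputs the large study is cut from 100 sequences to 10, while both sequences from the small study are retained. A real workflow would likely choose the cap, or a size-dependent reduction rate, from the dataset itself.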
