Journal: Molecular Biology and Evolution

Erasing Errors due to Alignment Ambiguity When Estimating Positive Selection

Abstract

Current estimates of diversifying positive selection rely on first having an accurate multiple sequence alignment. Simulation studies have shown that under biologically plausible conditions, relying on a single estimate of the alignment from commonly used alignment software can lead to unacceptably high false-positive rates in detecting diversifying positive selection. We present a novel statistical method that eliminates excess false positives resulting from alignment error by jointly estimating the degree of positive selection and the alignment under an evolutionary model. Our model treats both substitutions and insertions/deletions as sequence changes on a tree and allows site heterogeneity in the substitution process. We conduct inference starting from unaligned sequence data by integrating over all alignments. This approach naturally accounts for ambiguous alignments without requiring ambiguously aligned sites to be identified and removed prior to analysis. We take a Bayesian approach and conduct inference using Markov chain Monte Carlo to integrate over all alignments on a fixed evolutionary tree topology. We introduce a Bayesian version of the branch-site test and assess the evidence for positive selection using Bayes factors. We compare two models of differing dimensionality using a simple alternative to reversible-jump methods. We also describe a more accurate method of estimating the Bayes factor using Rao-Blackwellization. We then show using simulated data that jointly estimating the alignment and the presence of positive selection solves the problem with excessive false positives from erroneous alignments and has nearly the same power to detect positive selection as when the true alignment is known. We also show that samples taken from the posterior alignment distribution using the software have substantially lower alignment error compared with , , , and alignments
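
The abstract mentions assessing evidence with Bayes factors and sharpening the estimate by Rao-Blackwellization. The sketch below illustrates that general idea on a toy problem and is not the authors' implementation: it assumes an MCMC run that samples a hypothetical model indicator (1 = selection model, 0 = null model), and it compares counting the sampled indicators against averaging the conditional probability of the indicator at each iteration. The function names, the prior probability of 0.5, and the simulated conditional probabilities are all illustrative assumptions.

```python
import math
import random


def bayes_factor(posterior_prob, prior_prob=0.5):
    """Convert a posterior model probability into a Bayes factor (M1 vs. M0)."""
    if posterior_prob >= 1.0:
        return math.inf  # every sample favoured M1; the estimate is unbounded
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = posterior_prob / (1.0 - posterior_prob)
    return posterior_odds / prior_odds


def naive_estimate(indicators, prior_prob=0.5):
    """Estimate P(M1 | data) as the fraction of samples with indicator == 1."""
    return bayes_factor(sum(indicators) / len(indicators), prior_prob)


def rao_blackwell_estimate(cond_probs, prior_prob=0.5):
    """Estimate P(M1 | data) by averaging P(indicator = 1 | other parameters).

    Averaging the conditional probabilities instead of the sampled 0/1
    indicators targets the same expectation with lower variance.
    """
    return bayes_factor(sum(cond_probs) / len(cond_probs), prior_prob)


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for MCMC output: per-iteration conditional probabilities
    # (simulated here, not taken from any real analysis).
    cond_probs = [rng.betavariate(8, 2) for _ in range(2000)]
    # The indicator actually sampled at each iteration.
    indicators = [1 if rng.random() < p else 0 for p in cond_probs]
    print("naive Bayes factor:", naive_estimate(indicators))
    print("Rao-Blackwellized Bayes factor:", rao_blackwell_estimate(cond_probs))
```

Both estimators converge to the same value. The Rao-Blackwellized one matters most when the posterior probability is close to 0 or 1, where the sampled indicator rarely changes and simple counting becomes unstable.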
