Genetics: A Periodical Record of Investigations Bearing on Heredity and Variation

A Comparison of One-Rate and Two-Rate Inference Frameworks for Site-Specific dN/dS Estimation

Abstract

Two broad paradigms exist for inferring dN/dS, the ratio of nonsynonymous to synonymous substitution rates, from coding sequences: (i) a one-rate approach, where dN/dS is represented with a single parameter, or (ii) a two-rate approach, where dN and dS are estimated separately. The performance of these two approaches has been well studied in the specific context of proper model specification, i.e., when the inference model matches the simulation model. By contrast, the relative performance of one-rate vs. two-rate parameterizations when applied to data generated according to a different mechanism remains unclear. Here, we compare the relative merits of one-rate and two-rate approaches in the specific context of model misspecification by simulating alignments with mutation-selection models rather than with dN/dS-based models. We find that one-rate frameworks generally infer more accurate dN/dS point estimates, even when dS varies among sites. In other words, modeling dS variation may substantially reduce the accuracy of dN/dS point estimates. These results appear to depend on the selective constraint operating at a given site. For sites under strong purifying selection (dN/dS ≲ 0.3), one-rate and two-rate models show comparable performance. However, one-rate models significantly outperform two-rate models for sites under moderate-to-weak purifying selection. We attribute this distinction to the fact that, for these more quickly evolving sites, a given substitution is more likely to be nonsynonymous than synonymous. The data will therefore be relatively enriched for nonsynonymous changes, and modeling dS contributes excessive noise to dN/dS estimates. We additionally find that high levels of divergence among sequences are more critical than the number of sequences in the alignment for obtaining precise point estimates.
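The intuition that a separately estimated dS can add noise to a per-site ratio can be illustrated with a deliberately simplified toy simulation. The sketch below is not the study's method: it uses neither mutation-selection simulation nor likelihood-based inference. It assumes that per-site substitution counts are Poisson distributed, that the "one-rate-like" estimator treats the synonymous rate as a known shared constant, and that the "two-rate-like" estimator divides by a per-site synonymous count with a pseudocount. All names and parameter values (`simulate`, `syn_expected`, the 0.5 pseudocount) are hypothetical choices made for illustration.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate(true_omega, syn_expected=2.0, n_sites=5000, seed=1):
    """Return (rmse_one_rate, rmse_two_rate) for two toy ratio estimators.

    true_omega   -- the ratio used to generate the counts (assumed value)
    syn_expected -- expected synonymous changes per site (assumed value)
    """
    rng = random.Random(seed)
    sq_err_one = 0.0
    sq_err_two = 0.0
    for _ in range(n_sites):
        n_syn = poisson(rng, syn_expected)
        n_non = poisson(rng, true_omega * syn_expected)
        # One-rate-like toy: the denominator is a shared, fixed constant.
        est_one = n_non / syn_expected
        # Two-rate-like toy: the denominator is estimated from this site's
        # own synonymous count; the 0.5 pseudocount avoids division by zero.
        est_two = n_non / (n_syn + 0.5)
        sq_err_one += (est_one - true_omega) ** 2
        sq_err_two += (est_two - true_omega) ** 2
    return math.sqrt(sq_err_one / n_sites), math.sqrt(sq_err_two / n_sites)


if __name__ == "__main__":
    for omega in (0.1, 0.5, 0.9):
        rmse_one, rmse_two = simulate(omega)
        print(f"omega={omega}: one-rate RMSE={rmse_one:.3f}, "
              f"two-rate RMSE={rmse_two:.3f}")
```

In this toy setting the per-site denominator is itself a small, noisy count, so its sampling error propagates into the ratio. Because the toy hands the one-rate-like estimator the true synonymous rate, it shows only the direction of the effect and says nothing about its magnitude in real inference frameworks.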
