Journal: Molecular Biology and Evolution

False-Positive Selection Identified by ML-Based Methods: Examples from the Sig1 Gene of the Diatom Thalassiosira weissflogii and the tax Gene of a Human T-cell Lymphotropic Virus

Abstract

Sexually induced gene 1 (Sig1) in the centric diatom Thalassiosira weissflogii is considered to encode a gamete recognition protein. Sorhannus (2003) analyzed nucleotide sequences of Sig1 using parsimony analysis and the maximum-likelihood (ML)–based Bayesian method for inferring positive selection at single amino acid sites and reported that positively selected sites were detected by the latter method but not by the former. He then concluded that for this type of study, the ML-based method is more reliable than parsimony analysis. Here we show that his results apparently represent false-positive cases of the ML-based method and that there is no solid evidence that this gene contains positively selected sites. We further demonstrate that in the tax gene of human T-cell lymphotropic virus type I (HTLV-I), all codon sites, including invariable sites, can be inferred as positively selected sites by the ML-based method. These observations indicate that the ML-based method may produce many false-positive sites. One of the main reasons for the occurrence of false positives is that in the ML-based method, codon sites are grouped into several categories, with different nonsynonymous/synonymous rate ratios (ωs), on a purely statistical basis, and positive selection is inferred indirectly by examining whether the average ω for each category is greater than 1. In parsimony analysis, however, the evolutionary change of nucleotides at each codon site is examined. For this reason, parsimony-based methods rarely produce false positives and are safer than ML-based methods for detecting positive selection at individual codon sites, although a large number of sequences are necessary.
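The mechanism described in the abstract, in which sites are assigned to ω categories statistically rather than by examining the changes at each site, can be illustrated with a small sketch. It is not the authors' analysis or any particular program's implementation: it applies Bayes' rule to hypothetical per-site likelihoods, and the function names, the 0.95 cutoff, and all numbers are assumptions chosen for illustration. It shows how an invariable site, whose own data slightly favor the low-ω category, may still be flagged when the estimated proportion of the ω > 1 category is close to 1.

```python
def site_class_posteriors(priors, site_likelihoods):
    """Posterior probability of each omega category for one site (Bayes' rule).

    priors           -- estimated proportion of sites in each category
    site_likelihoods -- likelihood of the site's data under each category
    """
    joint = [p * lik for p, lik in zip(priors, site_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


def flag_positive_sites(omegas, priors, likelihoods_per_site, cutoff=0.95):
    """Return indices of sites whose posterior mass on omega > 1 reaches cutoff.

    The cutoff of 0.95 is an assumed convention, not taken from the abstract.
    """
    flagged = []
    for idx, liks in enumerate(likelihoods_per_site):
        post = site_class_posteriors(priors, liks)
        p_positive = sum(p for p, w in zip(post, omegas) if w > 1.0)
        if p_positive >= cutoff:
            flagged.append(idx)
    return flagged


# Hypothetical two-category model: omega = 0.1 (conserved) and omega = 3.0.
omegas = [0.1, 3.0]

# Hypothetical invariable site: its data are somewhat MORE likely under the
# conserved category (0.9) than under the omega > 1 category (0.6).
invariable_site = [[0.9, 0.6]]

# If nearly all sites are estimated to fall in the omega > 1 category,
# the category proportions dominate and the invariable site is flagged.
skewed = flag_positive_sites(omegas, [0.02, 0.98], invariable_site)

# With a small omega > 1 category, the same site is not flagged.
balanced = flag_positive_sites(omegas, [0.8, 0.2], invariable_site)

print(skewed, balanced)
```

In this toy setting the decision for the invariable site is driven almost entirely by the estimated category proportions, not by any substitution observed at that site, which is the indirect inference the abstract identifies as a source of false positives.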
