Journal: Nucleic Acids Research

Positional characterisation of false positives from computational prediction of human splice sites.


Abstract

The performance of computational tools that predict human splice sites is reviewed using a test set of EST-confirmed splice sites. The programs (HMMgene, NetGene2, HSPL, NNSPLICE, SpliceView and GeneID-3) differ from one another in the amount of discriminatory information they use for prediction. The results indicate that, as expected, HMMgene and NetGene2 (which use global as well as local coding information and splice signals), followed by HSPL (which uses local coding information and splice signals), performed better than the other three programs (which use only splice signals). For the former three programs, one in every three false positive splice sites was predicted in the vicinity of true splice sites, while only one in every 12 was expected to occur in such a region by chance. The persistence of this observation was then assessed for programs (FEXH, GRAIL2, MZEF, GeneID-3, HMMgene and GENSCAN) that can predict all potential exons, both optimal and sub-optimal. In a high proportion (>50%) of the partially correct predicted exons, the incorrect exon ends were located in the vicinity of the real splice sites. Analysis of the distribution of proximal false positives indicated that the splice signals used by the algorithms are not strong enough to discriminate false predictions, particularly those that occur within ±25 nt of the real sites. It is therefore suggested that specialised statistics that can discriminate real splice sites from proximal false positives be incorporated into gene prediction programs.
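The comparison between observed and chance-level proximal false positives can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not say how "vicinity" or the chance expectation was defined for the one-in-three and one-in-12 figures. The function names, the default 25 nt window (borrowed from the ±25 nt range mentioned above) and the uniform-placement baseline are all assumptions made for illustration.

```python
from bisect import bisect_left


def proximal_fraction(false_positives, true_sites, window=25):
    """Fraction of false-positive positions within `window` nt of a true site.

    Hypothetical helper: the window size is an assumption, not taken
    from the paper's definition of "vicinity".
    """
    if not false_positives:
        return 0.0
    sites = sorted(true_sites)
    near = 0
    for pos in false_positives:
        i = bisect_left(sites, pos)
        # The nearest true site is the one just before or just after pos.
        neighbours = sites[max(i - 1, 0): i + 1]
        if any(abs(pos - s) <= window for s in neighbours):
            near += 1
    return near / len(false_positives)


def chance_fraction(seq_length, true_sites, window=25):
    """Share of non-true positions (1-based) lying within `window` nt of a true site.

    Assumed baseline: false positives placed uniformly at random along
    the sequence. The paper may have used a different null model.
    """
    true_set = set(true_sites)
    covered = set()
    for s in true_set:
        covered.update(range(max(1, s - window), min(seq_length, s + window) + 1))
    covered -= true_set
    return len(covered) / (seq_length - len(true_set))


if __name__ == "__main__":
    # Toy coordinates, invented purely for illustration.
    true_sites = [100, 300]
    false_positives = [90, 130, 310, 500, 700, 326]
    print("observed proximal fraction:", proximal_fraction(false_positives, true_sites))
    print("chance-level fraction:", chance_fraction(1000, true_sites))
```

An observed fraction well above the chance-level fraction would correspond to the kind of enrichment near true splice sites that the abstract reports.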
