Journal: Nucleic Acids Research

Combining diverse evidence for gene recognition in completely sequenced bacterial genomes (published erratum appears in Nucleic Acids Res 1998 Aug 15;26(16):following 3870)


Abstract

Analysis of a newly sequenced bacterial genome starts with the identification of protein-coding genes. Functional assignment of proteins requires exact knowledge of protein N-termini. We present a new program, ORPHEUS, that identifies candidate genes and accurately predicts gene starts. The analysis starts with a database similarity search and the identification of reliable gene fragments. These fragments are used to derive statistical characteristics of protein-coding regions and ribosome-binding sites, and to predict the complete set of genes in the analyzed genome. In a test on the Bacillus subtilis and Escherichia coli genomes, the program correctly identified 93.3% and 96.3%, respectively, of the experimentally annotated genes longer than 100 codons described in the PIR-International database, and for these genes 96.3% (B. subtilis) and 83.9% (E. coli) of starts were predicted exactly. Furthermore, 98.9% (B. subtilis) and 99.1% (E. coli) of genes longer than 100 codons annotated in GenBank were found, and 92.9% and 75.7%, respectively, of predicted starts coincided with the feature table description. Finally, for the complete gene complements of B. subtilis and E. coli, including genes shorter than 100 codons, gene prediction accuracy was 88.9% and 87.1%, respectively, with 94.2% and 76.7% of starts coinciding with the existing annotation.
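The abstract does not give ORPHEUS's scoring details, so the following is only a toy Python sketch of the general idea of choosing a gene start with the help of a ribosome-binding-site signal. Everything in it is an assumption made for illustration: the fixed `AGGAGG` motif, the 20-nucleotide upstream window, the start-codon set, the forward-strand-only scan, and the names `rbs_score` and `find_genes`. In ORPHEUS these statistics are derived from reliable gene fragments of the analyzed genome, not hard-coded.

```python
# Illustrative sketch only; this is NOT the ORPHEUS algorithm.
START_CODONS = {"ATG", "GTG", "TTG"}   # assumed bacterial start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}
SD_MOTIF = "AGGAGG"                    # assumed Shine-Dalgarno-like motif


def rbs_score(seq, start, window=20):
    """Best count of positions matching SD_MOTIF in the window upstream of `start`."""
    upstream = seq[max(0, start - window):start]
    best = 0
    for i in range(len(upstream) - len(SD_MOTIF) + 1):
        site = upstream[i:i + len(SD_MOTIF)]
        best = max(best, sum(a == b for a, b in zip(site, SD_MOTIF)))
    return best


def find_genes(seq, min_codons=3):
    """Return sorted (start, end, score) tuples for forward-strand ORFs.

    `start` is the index of the chosen start codon, `end` is one past the
    stop codon, and `score` is the RBS score of the chosen start.
    """
    seq = seq.upper()
    genes = []
    for frame in range(3):
        candidates = []  # in-frame start codons seen since the last stop
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if codon in STOP_CODONS:
                usable = [s for s in candidates if (pos - s) // 3 >= min_codons]
                if usable:
                    # Highest RBS score wins; ties go to the most upstream start.
                    best = max(usable, key=lambda s: rbs_score(seq, s))
                    genes.append((best, pos + 3, rbs_score(seq, best)))
                candidates = []
            elif codon in START_CODONS:
                candidates.append(pos)
    return sorted(genes)


if __name__ == "__main__":
    demo = "CCAGGAGGCCCCC" + "ATG" + "AAA" * 4 + "TAA"
    print(find_genes(demo))
```

A real gene finder would also scan the reverse strand, score coding potential, and resolve overlapping candidates; the sketch leaves all of that out to keep the start-selection step visible.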
