Journal: Nucleic acids research

Detection of new genes in a bacterial genome using Markov models for three gene classes



Abstract

We further investigated the statistical features of the three classes of Escherichia coli genes that have been previously delineated by factorial correspondence analysis and dynamic clustering methods. A phased Markov model for a nucleotide sequence of each gene class was developed and employed for gene prediction using the GeneMark program. The protein-coding region prediction accuracy was determined for class-specific Markov models of different orders when the programs implementing these models were applied to gene sequences from the same or other classes. It is shown that at least two training sets and two program versions derived for different classes of E. coli genes are necessary in order to achieve a high accuracy of coding region prediction for uncharacterized sequences. Some annotated E. coli genes from Class I and Class III are shown to be spurious, whereas many open reading frames (ORFs) that have not been annotated in GenBank as genes are predicted to encode proteins. The amino acid sequences of the putative products of these ORFs initially did not show similarity to already known proteins. However, conserved regions have been identified in several of them by screening the latest entries in protein sequence databases and applying methods for motif search, while others of these new genes have been identified in independent experiments.
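To make the idea of a phased Markov model concrete, here is a minimal sketch, assuming a first-order model with add-one smoothing trained on in-frame coding sequences. It is not the GeneMark implementation: the function names `train_phased_markov` and `log_likelihood`, the pseudocount, the uniform fallback for unseen contexts, and the toy training sequence are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_phased_markov(seqs, order=1, pseudocount=1.0):
    """Estimate P(base | previous `order` bases, codon phase).

    Assumes every training sequence starts at codon position 0.
    Returns a list of three tables, one per codon phase.
    """
    counts = [
        defaultdict(lambda: dict.fromkeys(ALPHABET, pseudocount))
        for _ in range(3)
    ]
    for seq in seqs:
        for i in range(order, len(seq)):
            context, base = seq[i - order:i], seq[i]
            # Skip positions containing ambiguous symbols such as N.
            if base in ALPHABET and all(c in ALPHABET for c in context):
                counts[i % 3][context][base] += 1
    model = []
    for phase in range(3):
        table = {}
        for context, row in counts[phase].items():
            total = sum(row.values())
            table[context] = {b: n / total for b, n in row.items()}
        model.append(table)
    return model


def log_likelihood(seq, model, order=1, frame=0):
    """Log-probability of seq[order:] given that seq[0] sits at codon position `frame`."""
    uniform = 1.0 / len(ALPHABET)  # fallback for contexts never seen in training
    total = 0.0
    for i in range(order, len(seq)):
        row = model[(i + frame) % 3].get(seq[i - order:i])
        p = row.get(seq[i], uniform) if row else uniform
        total += math.log(p)
    return total


# Toy data, invented for illustration only.
toy_genes = ["ATG" + "GCT" * 10 + "TAA"]
model = train_phased_markov(toy_genes)
scores = [log_likelihood("GCTGCTGCTGCT", model, frame=f) for f in range(3)]
best_frame = scores.index(max(scores))
```

Under the same assumptions, class-specific prediction could be approximated by training one such model per gene class and keeping the highest-scoring class and frame for each candidate ORF. The abstract reports that at least two class-specific training sets were needed for accurate prediction.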
