Journal: Bioinformatics

An improved hidden Markov model for transmembrane protein detection and topology prediction and its applications to complete genomes

Abstract

Motivation: Knowledge of the transmembrane helical topology can help identify binding sites and infer functions for membrane proteins. However, because membrane proteins are hard to solubilize and purify, only a very small number of membrane proteins have experimentally determined structures and topologies. This has motivated various computational methods for predicting the topology of membrane proteins.

Results: We present an improved hidden Markov model, TMMOD, for the identification and topology prediction of transmembrane proteins. Our model uses TMHMM as a prototype, but differs from TMHMM in the architecture of the submodels for loops on both sides of the membrane and in the model training procedure. In cross-validation experiments using a set of 83 transmembrane proteins with known topology, TMMOD outperformed TMHMM and other existing methods, with an accuracy of 89% for both topology and locations. In another experiment using a separate set of 160 transmembrane proteins, TMMOD achieved an accuracy of 84% for topology and 89% for locations. When used to distinguish transmembrane proteins from non-transmembrane proteins, particularly signal peptides, TMMOD consistently produces fewer false positives than TMHMM. Application of TMMOD to a collection of complete genomes shows that predicted membrane proteins account for approximately 20-30% of all genes in those genomes, and that the topology in which both the N- and C-termini are in the cytoplasm is dominant in these organisms, except for Caenorhabditis elegans.
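To illustrate the general idea of HMM-based topology prediction, the following is a minimal sketch of Viterbi decoding over a toy four-state model (inside loop, in-to-out helix, outside loop, out-to-in helix). It is not TMMOD's or TMHMM's architecture, which the abstract does not specify in detail. The state names, the three residue classes, and every probability below are hypothetical values chosen only for illustration.

```python
import math

# Toy topology HMM. All states and numbers are illustrative assumptions,
# not the trained parameters or architecture of TMMOD or TMHMM.
STATES = ("i", "M_io", "o", "M_oi")
LABEL = {"i": "i", "M_io": "M", "o": "o", "M_oi": "M"}  # output symbols

START = {"i": 0.5, "o": 0.5}  # a sequence starts in a loop on either side
TRANS = {  # helices must alternate sides: i -> M_io -> o -> M_oi -> i
    "i": {"i": 0.9, "M_io": 0.1},
    "M_io": {"M_io": 0.9, "o": 0.1},
    "o": {"o": 0.9, "M_oi": 0.1},
    "M_oi": {"M_oi": 0.9, "i": 0.1},
}
HYDROPHOBIC = set("AILMFVWC")
POSITIVE = set("KR")
EMIT = {  # per-label emission probabilities over three residue classes
    "i": {"hydrophobic": 0.2, "positive": 0.3, "other": 0.5},
    "o": {"hydrophobic": 0.2, "positive": 0.05, "other": 0.75},
    "M": {"hydrophobic": 0.8, "positive": 0.02, "other": 0.18},
}


def log(p):
    """Natural log that maps zero probability to -inf."""
    return math.log(p) if p > 0 else float("-inf")


def emission(state, residue):
    """Probability that `state` emits the class of `residue`."""
    if residue in HYDROPHOBIC:
        cls = "hydrophobic"
    elif residue in POSITIVE:
        cls = "positive"
    else:
        cls = "other"
    return EMIT[LABEL[state]][cls]


def viterbi(seq):
    """Return the most probable label string ('i', 'M', 'o') for `seq`."""
    if not seq:
        return ""
    scores = {s: log(START.get(s, 0)) + log(emission(s, seq[0])) for s in STATES}
    backpointers = []
    for residue in seq[1:]:
        new_scores, ptr = {}, {}
        for s in STATES:
            # Best predecessor of state s at this position.
            prev = max(STATES, key=lambda p: scores[p] + log(TRANS[p].get(s, 0)))
            new_scores[s] = (scores[prev] + log(TRANS[prev].get(s, 0))
                             + log(emission(s, residue)))
            ptr[s] = prev
        scores = new_scores
        backpointers.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return "".join(LABEL[s] for s in path)


if __name__ == "__main__":
    print(viterbi("KRKRK" + "L" * 20 + "STSTS"))
```

Calling `viterbi` on a sequence returns one label per residue, so a run of `M` marks a predicted helix and the flanking `i`/`o` labels give the predicted orientation. The inside loop is given a higher probability of emitting K/R to mimic the well-known positive-inside tendency; a real predictor would learn such parameters from proteins of known topology.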