Source: Nature Public Health Emergency Collection (U.S. National Institutes of Health literature)

Prediction of Transmembrane Proteins from Their Primary Sequence by Support Vector Machine Approach


Abstract

Predicting transmembrane (TM) proteins from their sequences facilitates the functional study of genomes and the search for potential membrane-associated therapeutic targets. Computational methods for predicting TM sequences have been developed. These methods achieve high prediction accuracy for many TM proteins, but some are less effective for specific classes of TM proteins. Moreover, their performance has been tested on relatively small sets of TM and non-membrane (NM) proteins. It is therefore useful to evaluate TM protein prediction methods on a more diverse set of proteins and to test their performance on specific classes of TM proteins. This work extensively evaluated the capability of support vector machine (SVM) classification systems to predict TM proteins and proteins of several TM classes. These SVM systems were trained and tested using 14962 TM and 12168 NM proteins from Pfam protein families. An independent set of 3389 TM and 6063 NM proteins from curated Pfam families was used to further evaluate their performance. Of the TM and NM proteins, 90.1% and 86.7% respectively were correctly predicted, which is comparable to results from other studies. The prediction accuracies for proteins of specific TM classes are 95.6%, 90.0%, 92.7%, and 73.9% for G-protein coupled receptors, envelope proteins, outer membrane proteins, and transporters/channels, respectively, and 98.1%, 99.5%, 86.4%, and 98.6% for non-G-protein coupled receptors, non-envelope proteins, non-outer membrane proteins, and non-transporters/non-channels, respectively. Tested on a significantly larger and more diverse set of proteins than in previous studies, SVM systems appear capable of predicting TM proteins and proteins of specific TM classes at accuracies comparable to those of previous studies. Our SVM system, SVMProt, can be accessed at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.
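The per-class rates reported in the abstract can be combined into a single overall accuracy by weighting each rate by its class size. The sketch below is illustrative only: it assumes the 90.1% and 86.7% figures refer to the independent set of 3389 TM and 6063 NM proteins, which the abstract does not state explicitly, and the function name `overall_accuracy` is hypothetical, not part of SVMProt.

```python
def overall_accuracy(n_pos, n_neg, pos_rate, neg_rate):
    """Overall accuracy from per-class correct-prediction rates.

    pos_rate: fraction of positive (TM) proteins predicted correctly.
    neg_rate: fraction of negative (NM) proteins predicted correctly.
    """
    correct = pos_rate * n_pos + neg_rate * n_neg
    return correct / (n_pos + n_neg)


# Figures taken from the abstract. Pairing these rates with the
# independent evaluation set is an assumption made for illustration.
N_TM, N_NM = 3389, 6063
TM_RATE, NM_RATE = 0.901, 0.867

acc = overall_accuracy(N_TM, N_NM, TM_RATE, NM_RATE)
print(f"overall accuracy: {acc:.3f}")
```

Because the NM class is larger, the combined figure sits closer to the NM rate than to the TM rate, which is why per-class rates are worth reporting separately on imbalanced sets.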
