Journal of Molecular Biology

Distinguishing enzyme structures from non-enzymes without alignments.


Abstract

The ability to predict protein function from structure is becoming increasingly important as the number of structures resolved is growing more rapidly than our capacity to study function. Current methods for predicting protein function are mostly reliant on identifying a similar protein of known function. For proteins that are highly dissimilar or are only similar to proteins also lacking functional annotations, these methods fail. Here, we show that protein function can be predicted as enzymatic or not without resorting to alignments. We describe 1178 high-resolution proteins in a structurally non-redundant subset of the Protein Data Bank using simple features such as secondary-structure content, amino acid propensities, surface properties and ligands. The subset is split into two functional groupings, enzymes and non-enzymes. We use the support vector machine-learning algorithm to develop models that are capable of assigning the protein class. Validation of the method shows that the function can be predicted to an accuracy of 77% using 52 features to describe each protein. An adaptive search of possible subsets of features produces a simplified model based on 36 features that predicts at an accuracy of 80%. We compare the method to sequence-based methods that also avoid calculating alignments and predict a recently released set of unrelated proteins. The most useful features for distinguishing enzymes from non-enzymes are secondary-structure content, amino acid frequencies, number of disulphide bonds and size of the largest cleft. This method is applicable to any structure as it does not require the identification of sequence or structural similarity to a protein of known function.
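
As a small illustration of one of the simple features named in the abstract, amino acid frequencies, the sketch below computes a 20-element composition vector from a one-letter sequence. This is only an assumed, minimal form of such a feature, not the authors' implementation: the function name `amino_acid_frequencies`, the residue ordering, and the toy sequence are all hypothetical.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids as one-letter codes (ordering is an arbitrary choice).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_frequencies(sequence: str) -> List[float]:
    """Return the fraction of each standard residue, in AMINO_ACIDS order.

    Hypothetical helper for illustration; non-standard characters are ignored.
    """
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    total = len(residues)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Toy sequence invented for this example, not taken from the paper's data set.
features = amino_acid_frequencies("MKTAYIAKQR")
```

A vector like `features` could be concatenated with other per-protein descriptors (for example secondary-structure content) before being passed to a classifier such as a support vector machine.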
