Journal: Bioinformatics
Accurate sequence-based prediction of catalytic residues

Abstract

MOTIVATION: Prediction of catalytic residues provides useful information for research on enzyme function. Most existing prediction methods are based on structural information, which limits their use. We propose a sequence-based catalytic residue predictor that provides predictions with quality comparable to modern structure-based methods and that exceeds the quality of state-of-the-art sequence-based methods.

RESULTS: Our method (CRpred) uses sequence-based features and the sequence-derived PSI-BLAST profile. We used feature selection to reduce the dimensionality of the input (and to explain it) to the support vector machine (SVM) classifier that produces the predictions. Tests on eight datasets and side-by-side comparison with six modern structure- and sequence-based predictors show that CRpred provides predictions with quality comparable to current structure-based methods and better than sequence-based methods. The proposed method obtains 15-19% precision and a 48-58% TP (true positive) rate, depending on the dataset used. CRpred also provides confidence values that allow selecting a subset of predictions with higher precision. The improved quality is due to newly designed features and careful parameterization of the SVM. The features incorporate amino acids characterized by the highest and the lowest propensities to constitute catalytic residues; Gly, which provides flexibility for catalytic sites; and sequence motifs characteristic of certain catalytic reactions. Our features indicate that catalytic residues are on average more conserved than the general population of residues, and that highly conserved amino acids with high catalytic propensity are likely to form catalytic sites. We also show that local (with respect to the sequence) hydrophobicity contributes to the prediction.
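The two quality measures quoted in the abstract, precision and TP rate, and the idea of using confidence values to select a higher-precision subset of predictions, can be illustrated with a minimal sketch. This is not CRpred itself: the function name `precision_and_tp_rate`, the labels, the scores and the thresholds are all hypothetical, made up only to show how the two numbers are computed per residue and how they move when the confidence cutoff is raised.

```python
from typing import List, Tuple


def precision_and_tp_rate(
    labels: List[int], scores: List[float], threshold: float
) -> Tuple[float, float]:
    """Compute precision and TP rate for per-residue predictions.

    labels: 1 for a catalytic residue, 0 otherwise (hypothetical ground truth).
    scores: confidence values in [0, 1] (hypothetical predictor output).
    A residue is predicted catalytic when its score is >= threshold.
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if s >= threshold and y == 1)
    fp = sum(1 for y, s in pairs if s >= threshold and y == 0)
    fn = sum(1 for y, s in pairs if s < threshold and y == 1)
    # precision = TP / (TP + FP); TP rate = TP / (TP + FN)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    tp_rate = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, tp_rate


# Toy data for a 10-residue sequence (invented for illustration only).
labels = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.6, 0.2, 0.1, 0.4, 0.55, 0.05]

low_cutoff = precision_and_tp_rate(labels, scores, 0.5)
high_cutoff = precision_and_tp_rate(labels, scores, 0.85)

print("cutoff 0.50: precision=%.2f, TP rate=%.2f" % low_cutoff)
print("cutoff 0.85: precision=%.2f, TP rate=%.2f" % high_cutoff)
```

On this toy data, raising the cutoff keeps only the most confident prediction, so precision rises while the TP rate falls. That is the kind of trade-off the confidence values are described as enabling.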
