Journal: Molecular Simulation

Classification of conformational stability of protein mutants from 2D graph representation of protein sequences using support vector machines

Abstract

Euclidean distance counts derived from protein 2D graphs were used to encode protein structural information. A total of 35 amino acid 2D distance count (AA2DC) descriptors were calculated from the Euclidean distance matrices (EDM) derived from the 2D graphs, at distances ranging from 0.05 to 1.8 units with a lag of 0.05 units. The AA2DC descriptors were tested for building a predictive classification model of the sign of the change in the Gibbs free energy of thermal unfolding (ΔΔG) for a large data set of 2048 single point mutations in 64 proteins. A support vector machine (SVM) classifier with a radial basis function kernel was implemented to classify the conformational stability of protein mutants. The temperature and pH of the experimental ΔΔG measurements were also used for SVM training, in addition to the calculated AA2DC descriptors. The optimum SVM model correctly predicted about 72% of ΔΔG signs in cross-validation, both for the whole data set and for stable and unstable mutants separately. To the best of our knowledge, this level of accuracy for stable mutant recognition is the highest ever reported for a predictor using sequence information. Furthermore, the classifier adequately recognized disease-associated unstable mutants of human prion protein and human transthyretin.
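The distance-count idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the 2D graph coordinates are built or how a "count" is defined, so the function name `distance_counts`, the input `points` (hypothetical 2D coordinates of the residues) and the binning rule are all assumptions. One plausible reading is used here: the 36 distance values from 0.05 to 1.8 in steps of 0.05 are treated as bin edges, which yields 35 counts.

```python
import math
from itertools import combinations


def distance_counts(points, start=0.05, stop=1.8, lag=0.05):
    """Count point pairs whose Euclidean distance falls in each lag-wide bin.

    Assumption: bins are [start + k*lag, start + (k+1)*lag); pairs closer than
    `start` or farther than `stop` are ignored.
    """
    n_bins = round((stop - start) / lag)  # 35 bins for the default arguments
    counts = [0] * n_bins
    for (x1, y1), (x2, y2) in combinations(points, 2):
        d = math.hypot(x2 - x1, y2 - y1)  # one entry of the distance matrix
        if d < start or d > stop:
            continue
        # Clamp so that d == stop lands in the last bin.
        idx = min(int((d - start) / lag), n_bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical coordinates, for illustration only.
example_points = [(0.0, 0.0), (0.07, 0.0), (0.0, 0.52), (5.0, 5.0)]
descriptors = distance_counts(example_points)
print(len(descriptors), descriptors)
```

In a pipeline like the one described, each mutant's 35 counts would be combined with the measurement temperature and pH and passed to an RBF-kernel SVM, for example scikit-learn's `SVC(kernel="rbf")`. That library choice is also an assumption, since the abstract names no software.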