Conference: International Conference on Computational Intelligence Modeling Techniques and Applications

Analysis of features from protein-protein hetero-complex structures to predict protein interaction interfaces using machine learning


Abstract

Protein-protein interactions (PPIs) play central roles in most, if not all, biological processes; examples include hormone-receptor binding, signal transduction, chaperone activity, and antigen-antibody interactions. Disruption of PPIs may therefore lead to the development of inherited human diseases. Several analytical techniques can identify the amino acid residues in protein interfaces, but they are time-consuming, labour-intensive and, above all, very expensive. As an alternative to these analytical methods, we sought to develop machine learning tools that distinguish interface from non-interface amino acid residues. We used sequence- and structure-based features derived from a set of protein hetero-complex structure files from the Protein Data Bank (PDB). We built supervised predictors based on Random Forests (RF) and Support Vector Machines (SVMs) and evaluated them with 10-fold cross-validation. Both our sequence-based and our structure-based RF predictors performed better than the corresponding SVM-based ones. The most predictive features were those that measure sequence conservation at a given amino acid residue and various measures of the charge distribution in that residue's neighbourhood. We validated our sequence- and structure-based RF classifiers against protein complexes with experimentally proven interaction sites and found that the predictors detect protein interface residues in practice.
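The 10-fold cross-validation protocol mentioned in the abstract can be illustrated with a minimal sketch that uses only the Python standard library. Everything in it is an assumption made for illustration: the data are synthetic, the single "conservation" feature and the class ratio are invented, and the threshold rule is a hypothetical stand-in for the RF and SVM predictors, whose features and settings the abstract does not specify.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds, keeping the class ratio similar in each."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def fit_threshold(xs, ys):
    """Hypothetical stand-in classifier: midpoint between the two class means.

    Assumes interface residues (label 1) have the higher mean feature value.
    """
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def cross_validate(xs, ys, k=10, seed=0):
    """Return the held-out accuracy of the threshold rule for each of k folds."""
    folds = stratified_folds(ys, k, seed)
    scores = []
    for held_out in folds:
        test = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in test]
        train_y = [y for i, y in enumerate(ys) if i not in test]
        threshold = fit_threshold(train_x, train_y)
        correct = sum((xs[i] > threshold) == (ys[i] == 1) for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Synthetic residues (assumption): 100 interface (1) and 300 non-interface (0),
# each described by one invented conservation-like score.
data_rng = random.Random(42)
ys = [1] * 100 + [0] * 300
xs = [data_rng.gauss(0.7 if y == 1 else 0.4, 0.1) for y in ys]

folds = stratified_folds(ys, k=10)
scores = cross_validate(xs, ys, k=10)
mean_accuracy = sum(scores) / len(scores)
print(f"mean accuracy over {len(scores)} folds: {mean_accuracy:.3f}")
```

Each residue is held out exactly once, so the averaged score reflects performance on data the classifier did not see during fitting. Comparing two model families, as the abstract does, would amount to running the same folds with each model in place of `fit_threshold`. Plain accuracy is used here only for brevity and can be misleading when interface residues are a minority class.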
