Enhanced Prediction of Hot Spots at Protein-Protein Interfaces Using Extreme Gradient Boosting

Hao Wang; Chuyao Liu; Lei Deng

Abstract

Identification of hot spots, a small portion of protein-protein interface residues that contribute the majority of the binding free energy, can provide crucial information for understanding the function of proteins and studying their interactions. Based on our previous method (PredHS), we propose a new computational approach, PredHS2, that further improves the accuracy of predicting hot spots at protein-protein interfaces. First, we build a new training dataset of 313 alanine-mutated interface residues extracted from 34 protein complexes. We then generate 600 sequence, structure, exposure and energy features, together with Euclidean and Voronoi neighborhood properties. To remove redundant and irrelevant information, we select a set of 26 optimal features using a two-step feature selection method, which consists of a minimum Redundancy Maximum Relevance (mRMR) procedure followed by a sequential forward selection process. Based on the selected 26 features, we use Extreme Gradient Boosting (XGBoost) to build our prediction model. PredHS2 outperforms other machine learning algorithms on the training dataset and other state-of-the-art hot spot prediction methods on the independent test set (BID). Several novel features, such as solvent exposure characteristics, secondary structure features and disorder scores, are found to be more effective in discriminating hot spots. Moreover, the updated training dataset and the new feature selection and classification algorithms play a vital role in improving prediction quality.
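
The two-step feature selection described in the abstract (an mRMR ranking followed by sequential forward selection) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. All function names (`pearson`, `mrmr_rank`, `forward_select`, `centroid_accuracy`) are hypothetical, the absolute Pearson correlation stands in for the mutual information that mRMR normally uses, a toy nearest-centroid scorer stands in for the XGBoost model, and the data are synthetic rather than taken from the paper.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mrmr_rank(columns, labels, k):
    """Greedy mRMR-style ranking: maximize relevance minus mean redundancy.

    Assumption: |Pearson r| replaces mutual information in both terms.
    """
    relevance = [abs(pearson(col, labels)) for col in columns]
    selected, remaining = [], list(range(len(columns)))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(
                abs(pearson(columns[j], columns[s])) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def forward_select(candidates, evaluate):
    """Sequential forward selection: add the best feature until the score stops improving."""
    chosen, best = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        score, j = max((evaluate(chosen + [j]), j) for j in remaining)
        if score <= best:
            break
        chosen.append(j)
        remaining.remove(j)
        best = score
    return chosen, best


def centroid_accuracy(columns, labels, subset):
    """Toy scorer (stand-in for a classifier): nearest-centroid training accuracy."""
    rows = [[columns[j][i] for j in subset] for i in range(len(labels))]
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(v) / len(members) for v in zip(*members)]

    def predict(row):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])),
        )

    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


# Synthetic data: two informative features, one near-duplicate, one noise feature.
rng = random.Random(0)
labels = [i % 2 for i in range(40)]
f0 = [y + rng.gauss(0, 0.5) for y in labels]
f1 = [v + rng.gauss(0, 0.05) for v in f0]  # nearly redundant with f0
f2 = [y + rng.gauss(0, 0.8) for y in labels]
f3 = [rng.gauss(0, 1) for _ in labels]
columns = [f0, f1, f2, f3]


def evaluate(subset):
    return centroid_accuracy(columns, labels, subset)


ranked = mrmr_rank(columns, labels, k=3)
chosen, best = forward_select(ranked, evaluate)
print("mRMR candidates:", ranked)
print("forward-selected subset:", chosen, "score:", best)
```

In a realistic setting, `evaluate` would presumably wrap a cross-validated score of the actual classifier (XGBoost in the abstract) rather than a training accuracy, since scoring on the training data tends to favor overfitted subsets.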

Bibliographic details

Source: Scientific Reports, 2018, Issue 1
Authors: Hao Wang; Chuyao Liu; Lei Deng
Original format: PDF

Similar literature

1. Prediction of hot spots in protein–DNA binding interfaces based on supervised isometric feature mapping and extreme gradient boosting [J]. Ke Li, Sijia Zhang, Di Yan. BMC Bioinformatics. 2020, Issue S13.
2. XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting [J]. Lei Deng, Yuanchao Sui, Jingpu Zhang. Genes. 2019, Issue 3.
3. A semi-supervised boosting SVM for predicting hot spots at protein-protein Interfaces [J]. Bin Xu, Xiaoming Wei, Lei Deng. BMC Veterinary Research. 2012, Supplement 2.
4. Boosting Prediction Performance of Protein-Protein Interaction Hot Spots by Using Structural Neighborhood Properties (Extended Abstract) [C]. Lei Deng, Jihong Guan, Xiaoming Wei. Annual international conference on research in computational molecular biology. 2013.
5. Xtreme-NoC: Extreme Gradient Boosting Based Latency Model for Network-on-Chip Architectures [D]. Sheriff, Ilma. 2021.
6. Enhanced Prediction of Hot Spots at Protein-Protein Interfaces Using Extreme Gradient Boosting [O]. Hao Wang, Chuyao Liu, Lei Deng. 2018.
