Conference: Asia-Pacific Bioinformatics Conference

PredRSA: a gradient boosted regression trees approach for predicting protein solvent accessibility


Abstract

Background: Protein solvent accessibility prediction is a pivotal intermediate step towards modeling protein tertiary structures directly from one-dimensional sequences. It also plays an important part in identifying protein folds and domains. Although several methods for protein solvent accessibility prediction have been presented in recent years, their performance is far from satisfactory. In this work, we propose PredRSA, a computational method that can accurately predict the relative solvent accessible surface area (RSA) of residues by exploring various local and global sequence features which have been observed to be associated with solvent accessibility. Based on these features, a novel and efficient approach, Gradient Boosted Regression Trees (GBRT), is adopted for the first time to predict RSA.

Results: Experimental results obtained from 5-fold cross-validation on the Manesh-215 dataset show that the mean absolute error (MAE) and the Pearson correlation coefficient (PCC) of PredRSA are 9.0 % and 0.75, respectively, which are better than those of the existing methods. Moreover, we evaluate the performance of PredRSA using an independent test set of 68 proteins. Compared with the state-of-the-art approaches (SPINE-X and ASAquick), PredRSA achieves a significant improvement in prediction quality.

Conclusions: Our experimental results show that the Gradient Boosted Regression Trees algorithm and the novel feature combination are quite effective in relative solvent accessibility prediction. The proposed PredRSA method could be useful in assisting the prediction of protein structures by applying the predicted RSA as useful restraints.
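To make the two ideas in the abstract concrete (boosting regression trees on residuals, then scoring with MAE and PCC), here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes a squared-error loss, uses depth-1 trees (stumps) in place of full regression trees, and runs on a synthetic grid of two hypothetical per-residue features rather than the Manesh-215 data. All function names, parameters, and data values are illustrative assumptions.

```python
import math

def fit_stump(xs, residuals):
    """Depth-1 regression tree: pick the (feature, threshold) with lowest squared error."""
    best = None
    for j in range(len(xs[0])):
        values = sorted(set(x[j] for x in xs))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for x, r in zip(xs, residuals) if x[j] <= t]
            right = [r for x, r in zip(xs, residuals) if x[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:  # no valid split: fall back to a constant prediction
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]

def stump_predict(stump, x):
    j, t, left_value, right_value = stump
    return left_value if x[j] <= t else right_value

def fit_gbrt(xs, ys, n_trees=100, learning_rate=0.3):
    """Gradient boosting for squared error: each stump fits the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x) for p, x in zip(preds, xs)]
    return base, learning_rate, stumps

def gbrt_predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def pcc(y_true, y_pred):
    """Pearson correlation coefficient."""
    mt, mp = sum(y_true) / len(y_true), sum(y_pred) / len(y_pred)
    cov = sum((a - mt) * (b - mp) for a, b in zip(y_true, y_pred))
    var_t = sum((a - mt) ** 2 for a in y_true)
    var_p = sum((b - mp) ** 2 for b in y_pred)
    return cov / math.sqrt(var_t * var_p)

# Synthetic toy data (assumption): two hypothetical features on a 10x10 grid,
# with a made-up RSA target in percent. Real inputs would be sequence-derived.
grid = [(i / 9.0, j / 9.0) for i in range(10) for j in range(10)]
rsa = [60.0 * (1.0 - a) + 40.0 * b for a, b in grid]

# Simple hold-out split by index parity (the paper uses 5-fold cross-validation).
train_x, train_y = grid[0::2], rsa[0::2]
test_x, test_y = grid[1::2], rsa[1::2]

model = fit_gbrt(train_x, train_y)
test_pred = [gbrt_predict(model, x) for x in test_x]

baseline_mae = mae(test_y, [sum(train_y) / len(train_y)] * len(test_y))
test_mae = mae(test_y, test_pred)
test_pcc = pcc(test_y, test_pred)
print(f"baseline MAE={baseline_mae:.2f}  GBRT MAE={test_mae:.2f}  PCC={test_pcc:.3f}")
```

The numbers printed by this toy run say nothing about PredRSA's reported performance; they only show how MAE and PCC are computed for a boosted model against a mean-value baseline.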