Journal: Proteins: Structure, Function, and Genetics

Accurate prediction of protein folding rates from sequence and sequence-derived residue flexibility and solvent accessibility.



Abstract

Protein folding rates vary by several orders of magnitude and depend on the topology of the fold and on the size and composition of the sequence. Although recent work shows that the rates can be predicted from the sequence, allowing for high-throughput annotation, existing methods consider only the sequence and its predicted secondary structure. We propose a novel sequence-based predictor, PFR-AF, which utilizes solvent accessibility and residue flexibility predicted from the sequence to improve predictions and provide insights into the folding process. The predictor includes three linear regressions, for proteins with two-state, multistate, and unknown (mixed-state) folding kinetics. PFR-AF on average outperforms current methods when tested on three datasets. The proposed approach provides high-quality predictions in the absence of similarity between the predicted and the training sequences. PFR-AF's predictions show high correlation with the experimental rates (between 0.71 and 0.95, depending on the dataset) and the lowest mean absolute errors (between 0.75 and 0.9), as measured using out-of-sample tests. Our models reveal that, for two-state chains, the inclusion of solvent-exposed Ala may accelerate folding, while an increased content of Ile may reduce the folding speed. We also demonstrate that increased flexibility of coils facilitates faster folding and that proteins with a larger content of solvent-exposed strands may fold at a slower pace. Increased flexibility of the solvent-exposed residues is shown to prolong folding, which also holds, with a lower correlation, for buried residues. Two case studies are included to support our findings.
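The evaluation described in the abstract (a linear regression scored by correlation and mean absolute error on out-of-sample predictions) can be illustrated with a minimal sketch. This is not the PFR-AF model: it assumes a single hypothetical feature, uses leave-one-out as one possible out-of-sample protocol, and runs on made-up toy values rather than data from the paper. All function and variable names are illustrative.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with one feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def leave_one_out(xs, ys):
    """Predict each point from a line fitted on all the other points."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a + b * xs[i])
    return preds


def pearson(us, vs):
    """Pearson correlation coefficient between two equal-length lists."""
    mu, mv = mean(us), mean(vs)
    cov = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    var_u = sum((u - mu) ** 2 for u in us)
    var_v = sum((v - mv) ** 2 for v in vs)
    return cov / (var_u * var_v) ** 0.5


def mae(us, vs):
    """Mean absolute error between predictions and reference values."""
    return mean(abs(u - v) for u, v in zip(us, vs))


# Made-up toy values (NOT data from the paper): one hypothetical
# sequence-derived feature per protein and a hypothetical folding rate.
feature = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
rates = [4.1, 3.6, 3.2, 2.5, 2.2, 1.4]

preds = leave_one_out(feature, rates)
print("correlation:", round(pearson(preds, rates), 2))
print("mean absolute error:", round(mae(preds, rates), 2))
```

Under these assumptions, a setup with three regressions like the one the abstract describes would amount to fitting one such model per kinetics class (two-state, multistate, mixed-state), each on its own set of features.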
