Journal: PLoS One

Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network


Abstract

Rational protein design aims at the targeted modification of existing proteins. To reach this goal, software suites like Rosetta propose sequences that introduce the desired properties. Challenging design problems require a protein to be represented by a structural ensemble. For this reason, Rosetta multi-state design (MSD) protocols have been developed in which each state represents one protein conformation. The computational demands of MSD protocols are high, because for each candidate sequence a costly three-dimensional (3D) model has to be created and assessed for all states. Each of these scores contributes one data point to a complex, design-specific energy landscape. Because neural networks (NNs) have proved well suited to learning such solution spaces, we integrated one into the Rosetta:MSF framework in place of the genetic algorithm used so far, with the aim of reducing computational costs. Like its predecessor, Rosetta:MSF:NN administers a set of candidate sequences and their scores and scans sequence space iteratively. During each iteration, the union of all candidate sequences and their Rosetta scores is used to re-train NNs that possess a design-specific architecture. The enormous speed of the NNs allows an extensive assessment of alternative sequences, which are ranked by the scores predicted by the NN. Costly 3D models are computed only for a small fraction of the best-scoring sequences; these sequences and their 3D-based scores replace half of the candidate sequences during each iteration. The analysis of two sets of candidate sequences generated for a specific design problem by means of a genetic algorithm confirmed that the NN predicted 3D-based scores well; the Pearson correlation coefficient was at least 0.95. Applying Rosetta:MSF:NN:enzdes to a benchmark consisting of 16 ligand-binding problems showed that this protocol converges ten times faster than the genetic algorithm and finds sequences with comparable scores.
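The iteration described in the abstract can be illustrated with a minimal sketch. It is not the Rosetta:MSF:NN implementation. Every name in it (`surrogate_search`, `expensive_score`, `fit_ridge`, `mutate`) is hypothetical. The costly 3D-based score is replaced by a toy linear function, and a ridge regression stands in for the neural network. The sketch also assumes that alternatives are proposed by single point mutations and that the worse-scoring half of the candidates is the half that gets replaced; the abstract does not specify either detail.

```python
import numpy as np

AMINO = "ACDEFGHIKLMNPQRSTVWY"

def one_hot(seqs):
    """Encode equal-length sequences as flat one-hot vectors."""
    idx = np.array([[AMINO.index(a) for a in s] for s in seqs])
    return np.eye(len(AMINO))[idx].reshape(len(seqs), -1)

def expensive_score(seq, weights):
    """Toy stand-in for a costly 3D-model score (lower is better)."""
    return float(one_hot([seq])[0] @ weights)

def fit_ridge(X, y, lam=1e-2):
    """Ridge regression, a simple stand-in for the re-trained neural network."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

def mutate(seq, rng):
    """Assumed proposal move: replace one random position."""
    pos = rng.integers(len(seq))
    return seq[:pos] + AMINO[rng.integers(len(AMINO))] + seq[pos + 1:]

def surrogate_search(length=8, pool_size=40, n_iter=10, n_alt=2000, seed=0):
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=length * len(AMINO))  # hidden toy energy function
    pool = ["".join(rng.choice(list(AMINO), size=length)) for _ in range(pool_size)]
    scores = {s: expensive_score(s, weights) for s in pool}  # every sequence scored so far
    history = []
    for _ in range(n_iter):
        # Re-train the surrogate on the union of all scored sequences.
        seqs = list(scores)
        coef = fit_ridge(one_hot(seqs), np.array([scores[s] for s in seqs]))
        # Cheaply assess many alternatives with the surrogate.
        alts = dict.fromkeys(mutate(pool[rng.integers(len(pool))], rng) for _ in range(n_alt))
        alts = [s for s in alts if s not in scores]
        pred = one_hot(alts) @ coef
        # Compute the expensive score only for the best-predicted fraction.
        best = [alts[i] for i in np.argsort(pred)[: pool_size // 2]]
        for s in best:
            scores[s] = expensive_score(s, weights)
        # Assumption: the worse-scoring half of the pool is replaced.
        keep = sorted(pool, key=scores.get)[: pool_size - len(best)]
        pool = keep + best
        history.append(min(scores[s] for s in pool))
    return pool, scores, history

if __name__ == "__main__":
    pool, scores, history = surrogate_search()
    print("best score per iteration:", [round(h, 2) for h in history])
```

Because the better-scoring half of the candidates is always retained, the best score in the pool cannot get worse from one iteration to the next. To check the agreement between predicted and computed scores in the same way as the abstract reports it, `np.corrcoef` could be applied to `pred` and the corresponding entries of `scores`.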
