Journal: Journal of Chemical Theory and Computation (JCTC)

Bayesian Active Learning for Optimization and Uncertainty Quantification in Protein Docking

Abstract

Ab initio protein docking represents a major challenge for optimizing a noisy and costly "black box"-like function in a high-dimensional space. Despite progress in this field, rigorous uncertainty quantification (UQ) is lacking. To fill the gap, we introduce a novel algorithm, Bayesian active learning (BAL), for optimization and UQ of such black-box functions, with applications to flexible protein docking. BAL directly models the posterior distribution of the global optimum (i.e., native structures), with active sampling and posterior estimation iteratively feeding each other. Furthermore, it uses complex normal modes to span a homogeneous, Euclidean conformation space suitable for high-dimensional optimization and constructs funnel-like energy models for quality estimation of encounter complexes. Over a protein-docking benchmark set and a CAPRI set including homology docking, we establish that BAL significantly improves upon both the starting points from rigid docking and the refinements by particle swarm optimization, providing a top-3 near-native prediction for one third of the targets. Quality assessment empowered with UQ leads to tight quality intervals, with a half range of around 25% of the actual interface root-mean-square deviation at a confidence level of 85%. BAL's estimated probability of a prediction being near-native achieves an area under the receiver operating characteristic curve (AUROC) of 0.93 in binary classification and an area under the precision-recall curve above 0.60 (compared to 0.50 and 0.14, respectively, by chance), which also improves the ranking of predictions. This study represents the first UQ solution for protein docking, with rigorous theoretical frameworks and comprehensive empirical assessments.
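The chance-level figures quoted in the abstract (0.50 for AUROC and 0.14 for the area under the precision-recall curve) can be made concrete with a small sketch. It is not the authors' evaluation code: the data are synthetic, the function names are hypothetical, and the 14% share of near-native predictions is an assumption inferred from the quoted chance baseline, since a random scorer's precision-recall area is roughly the positive-class prevalence.

```python
import random

def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of the precision values at each positive, scanning from highest score down.

    This is a common estimate of the area under the precision-recall curve.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Synthetic labels: 14% "near-native" (assumed prevalence, see lead-in).
n_total, n_pos = 2000, 280
labels = [1] * n_pos + [0] * (n_total - n_pos)

# A scorer that ignores the labels entirely, i.e. pure chance.
rng = random.Random(0)
random_scores = [rng.random() for _ in labels]

chance_auroc = auroc(labels, random_scores)
chance_ap = average_precision(labels, random_scores)

# A scorer that separates the classes perfectly, for contrast.
perfect_auroc = auroc(labels, labels)
perfect_ap = average_precision(labels, labels)

print(f"chance:  AUROC={chance_auroc:.2f}  AP={chance_ap:.2f}")
print(f"perfect: AUROC={perfect_auroc:.2f}  AP={perfect_ap:.2f}")
```

Under these assumptions the random scorer lands near 0.50 and near the prevalence, which is why an area under the precision-recall curve above 0.60 is a much larger gain over chance than the raw number suggests.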
