Deep2Full: Evaluating strategies for selecting the minimal mutational experiments for optimal computational predictions of deep mutational scan outcomes

Abstract

Performing a complete deep mutational scan covering all single point mutations may not be practical, and may not even be required, especially if predictive computational models can be developed. Computational models are, however, naive to the cellular response under the myriad assay conditions. Within a realistic paradigm of assay-context-aware predictive hybrid models, which combine minimal experimental data from deep mutational scans with structure, sequence information and computational models, we define and evaluate different strategies for choosing this minimal set. We evaluated the trivial strategy of systematically reducing the number of mutational studies from 85% to 15%, along with several other strategies concerning the choice of mutation types, such as random versus site-directed, at the same 15% data completeness. Interestingly, the predictive capability obtained by training on a random set of mutations was comparable to that obtained by training on a systematic substitution of all amino acids to alanine, asparagine and histidine (ANH). Another strategy we explored, augmenting the training data with measurements of the same mutants at multiple assay conditions, did not improve the prediction quality. For the six proteins we analyzed, the bin-wise error in prediction is optimal when 50-100 mutations per bin are used in training the computational model, suggesting that good prediction quality may be achieved with a library of 500-1000 mutations.
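
The two selection strategies compared in the abstract, a random subset of single point mutations versus a systematic substitution of every residue to alanine, asparagine and histidine (ANH), can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy sequence and the fixed seed are hypothetical, and the sketch covers only how the training sets would be chosen, not the predictive model itself. One observation, offered as an assumption rather than a statement from the paper: substituting to 3 of the 19 possible alternative residues covers roughly 3/19, or about 16%, of all single substitutions, which is close to the 15% data completeness the abstract mentions.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def all_single_mutants(seq):
    """List every single substitution as (position, wild_type, mutant), 1-based."""
    return [(pos, wt, aa)
            for pos, wt in enumerate(seq, start=1)
            for aa in AMINO_ACIDS if aa != wt]

def random_subset(mutants, fraction, seed=0):
    """Pick a random fraction of the mutants (rounded down) for training."""
    rng = random.Random(seed)
    return rng.sample(mutants, int(fraction * len(mutants)))

def anh_subset(mutants, targets="ANH"):
    """Keep only substitutions to alanine, asparagine or histidine."""
    return [m for m in mutants if m[2] in targets]

# Hypothetical toy sequence, used only to show the set sizes.
seq = "MKTAYIAKQR"
full = all_single_mutants(seq)
rand_train = random_subset(full, 0.15)
anh_train = anh_subset(full)

print(len(full), len(rand_train), len(anh_train))
print(f"ANH coverage: {len(anh_train) / len(full):.1%}")
```

Positions whose wild-type residue is already A, N or H contribute only two ANH substitutions, so the ANH set is slightly smaller than three times the sequence length.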

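Bibliographic details

  • Authors

    C. K. Sruthi; Meher Prakash

  • Affiliation
  • Year: 2020
  • Total pages
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Date indexed: 2022-08-20 22:01:13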
