Journal: PLoS Computational Biology

Integrating Statistical Predictions and Experimental Verifications for Enhancing Protein-Chemical Interaction Predictions in Virtual Screening



Abstract

Predicting interactions between target proteins and potential leads is of great benefit in the drug discovery process. We present a broadly applicable statistical prediction method for interactions between any proteins and chemical compounds; it requires only protein sequence data and chemical structure data and uses support vector machines (SVMs) as the statistical learning method. Because such comprehensive predictions can involve many false positives, we propose two approaches for reducing them: (i) efficient use of multiple statistical prediction models in the framework of a two-layer SVM and (ii) reasonable design of the negative data used to construct the statistical prediction models. In the two-layer SVM, the outputs of the first-layer SVM models, which are constructed with different negative samples and reflect different aspects of classification, are used as inputs to the second-layer SVM. To design negative data that produce fewer false positive predictions, we iteratively construct SVM models, or classification boundaries, from positive and tentative negative samples and select additional negative sample candidates according to pre-determined rules. Moreover, to fully exploit the advantages of statistical learning methods, we propose a strategy for feeding experimental results back into computational predictions, taking into account the biological effects of interest. We show the usefulness of our approach by predicting potential ligands binding to human androgen receptors from more than 19 million chemical compounds and verifying these predictions by in vitro binding. We then use this experimental validation as feedback to enhance subsequent computational predictions, and experimentally validate these predictions again.
Iterating in silico prediction with in vitro or in vivo experimental verification, with sufficient feedback at each round, enabled us to identify novel ligand candidates that were distant from known ligands in chemical space.
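The two-layer SVM idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes toy 2-D feature vectors in place of real protein-sequence and chemical-structure features, and it assumes a simple linear SVM trained by stochastic subgradient descent on the hinge loss. All names (`train_linear_svm`, `train_two_layer`, `predict_two_layer`, `negatives_a`, `negatives_b`) and all hyperparameters are hypothetical. Each first-layer model is trained on the same positives against a different negative set, and the second-layer model takes the first-layer decision values as its inputs.

```python
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_linear_svm(X, y, epochs=200, eta=0.01, lam=0.01, seed=0):
    """Fit a linear soft-margin SVM by stochastic subgradient descent on the
    L2-regularized hinge loss. Labels must be +1 or -1. Returns (w, b)."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * (dot(w, X[i]) + b) < 1.0
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regularizer)
            if violated:  # hinge-loss subgradient step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def decision_value(model, x):
    w, b = model
    return dot(w, x) + b


def train_two_layer(positives, negative_sets):
    """First layer: one SVM per negative set (same positives each time).
    Second layer: one SVM trained on the first-layer decision values."""
    first_layer = []
    for negs in negative_sets:
        X = positives + negs
        y = [1] * len(positives) + [-1] * len(negs)
        first_layer.append(train_linear_svm(X, y))
    all_X = positives + [x for negs in negative_sets for x in negs]
    all_y = [1] * len(positives) + [-1] * (len(all_X) - len(positives))
    Z = [[decision_value(m, x) for m in first_layer] for x in all_X]
    second_layer = train_linear_svm(Z, all_y)
    return first_layer, second_layer


def predict_two_layer(first_layer, second_layer, samples):
    preds = []
    for x in samples:
        z = [decision_value(m, x) for m in first_layer]
        preds.append(1 if decision_value(second_layer, z) >= 0 else -1)
    return preds


# Toy data (assumption): one positive cluster and two different negative sets.
positives = [[2.0, 2.0], [3.0, 2.0], [2.0, 3.0], [3.0, 3.0]]
negatives_a = [[-2.0, -2.0], [-3.0, -2.0], [-2.0, -3.0], [-3.0, -3.0]]
negatives_b = [[-2.0, 2.0], [-3.0, 2.0], [-2.0, 3.0], [-3.0, 3.0]]

first_layer, second_layer = train_two_layer(positives, [negatives_a, negatives_b])
queries = [[4.0, 4.0], [-4.0, -4.0], [-4.0, 4.0]]
print(predict_two_layer(first_layer, second_layer, queries))
```

In this toy setup, the model trained against `negatives_a` scores points near `negatives_b` close to its own boundary, while the model trained against `negatives_b` rejects them clearly. The second layer can weight the two scores so that such points are not reported as positives, which is the kind of false-positive reduction the abstract attributes to combining models built from different negative samples.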
