Published in: Journal of chemical information and modeling

Conformal Regression for Quantitative Structure-Activity Relationship Modeling-Quantifying Prediction Uncertainty



Abstract

Making predictions with an associated confidence is highly desirable as it facilitates decision making and resource prioritization. Conformal regression is a machine learning framework that allows the user to define the required confidence and delivers predictions that are guaranteed to be correct to the selected extent. In this study, we apply conformal regression to model molecular properties and bioactivity values and investigate different ways to scale the resultant prediction intervals to create as efficient (i.e., narrow) regressors as possible. Different algorithms to estimate the prediction uncertainty were used to normalize the prediction ranges, and the different approaches were evaluated on 29 publicly available data sets. Our results show that the most efficient conformal regressors are obtained when using the natural exponential of the ensemble standard deviation from the underlying random forest to scale the prediction intervals, but other approaches were almost as efficient. This approach afforded an average prediction range of 1.65 pIC50 units at the 80% confidence level when applied to bioactivity modeling. The choice of nonconformity function has a pronounced impact on the average prediction range with a difference of close to one log unit in bioactivity between the tightest and widest prediction range. Overall, conformal regression is a robust approach to generate bioactivity predictions with associated confidence.
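
The scaling idea described in the abstract can be illustrated with a minimal sketch of split (inductive) conformal regression. It is not the authors' implementation. It assumes that point predictions and an uncertainty estimate `sigma` (for example, the standard deviation across the trees of a random forest) are already available for a held-out calibration set and for the test compounds. The function name `conformal_intervals`, its parameters, and the example numbers are hypothetical. The stated coverage holds on average only if calibration and test examples are exchangeable.

```python
import math


def conformal_intervals(cal_true, cal_pred, cal_sigma,
                        test_pred, test_sigma, confidence=0.8):
    """Sketch of split conformal regression with exp(sigma)-scaled intervals.

    All names are illustrative assumptions, not taken from the paper.
    """
    # Normalized nonconformity score per calibration example:
    # absolute error divided by exp(uncertainty estimate).
    scores = sorted(
        abs(y - p) / math.exp(s)
        for y, p, s in zip(cal_true, cal_pred, cal_sigma)
    )
    n = len(scores)

    # Finite-sample rank: the ceil((n + 1) * confidence)-th smallest score.
    rank = math.ceil((n + 1) * confidence)
    if rank > n:
        raise ValueError("calibration set too small for this confidence level")
    alpha = scores[rank - 1]

    # Each test interval is the point prediction plus/minus alpha,
    # rescaled by that example's own exp(sigma).
    intervals = []
    for p, s in zip(test_pred, test_sigma):
        half_width = alpha * math.exp(s)
        intervals.append((p - half_width, p + half_width))
    return intervals


if __name__ == "__main__":
    # Made-up calibration data in pIC50 units (illustration only).
    cal_true = [5.0, 6.2, 7.1, 4.8, 6.6, 5.9, 7.4, 5.3, 6.0, 6.8]
    cal_pred = [5.4, 6.0, 6.5, 5.1, 6.9, 5.8, 6.6, 5.6, 6.1, 6.2]
    cal_sigma = [0.3, 0.2, 0.6, 0.3, 0.4, 0.1, 0.7, 0.2, 0.1, 0.5]
    for low, high in conformal_intervals(cal_true, cal_pred, cal_sigma,
                                         test_pred=[6.3, 5.5],
                                         test_sigma=[0.2, 0.6]):
        print(f"[{low:.2f}, {high:.2f}]")
```

Because every interval is rescaled by its own `exp(sigma)`, compounds on which the ensemble disagrees more receive wider intervals. With all `sigma` values set to a constant, every test compound gets the same interval width.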

Bibliographic details

  • Author affiliations

    Univ Cambridge Ctr Mol Informat Dept Chem Lensfield Rd Cambridge CB2 1EW England;

    Univ Cambridge Ctr Mol Informat Dept Chem Lensfield Rd Cambridge CB2 1EW England;

    Karolinska Inst Unit Toxicol Sci Swetox Forskargatan 20 SE-15136 Sodertalje Sweden;

    Univ Cambridge Ctr Mol Informat Dept Chem Lensfield Rd Cambridge CB2 1EW England;

    Uppsala Univ Dept Pharmaceut Biosci Box 591 SE-75124 Uppsala Sweden;

    AstraZeneca IMED Biotech Unit Discovery Sci Quantitat Biol SE-43183 Molndal Sweden;

    Univ Cambridge Ctr Mol Informat Dept Chem Lensfield Rd Cambridge CB2 1EW England;

  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Chemistry; Chemical industry
