General Approach to Estimate Error Bars for Quantitative Structure-Activity Relationship Predictions of Molecular Activity

Liu Ruifeng; Glover Kyle P.; Feasel Michael G.; Wallqvist Anders


Abstract

Key requirements for quantitative structure-activity relationship (QSAR) models to gain acceptance by regulatory authorities include a defined domain of applicability (DA) and appropriate measures of goodness-of-fit, robustness, and predictivity. Hence, many DA metrics have been developed over the past two decades. The most intuitive are perhaps distance-to-model metrics, which are most commonly defined in terms of the mean distance between a molecule and its k nearest training samples. Detailed evaluations have shown that the variance of predictions by an ensemble of QSAR models may serve as a DA metric and can outperform distance-to-model metrics. Intriguingly, the performance of the ensemble variance metric has led researchers to conclude that the error of predicting a new molecule does not depend on the input descriptors or machine-learning methods but on its distance to the training molecules. This implies that the distance to training samples may serve as the basis for developing a high-performance DA metric. In this article, we introduce a new Tanimoto distance-based DA metric called the sum of distance-weighted contributions (SDC), which takes into account contributions from all molecules in a training set. Using four acute chemical toxicity data sets of varying sizes and four other molecular property data sets, we demonstrate that SDC correlates well with the prediction error for all data sets regardless of the machine-learning methods and molecular descriptors used to build the QSAR models. Using the acute toxicity data sets, we compared the distribution of prediction errors with respect to SDC, the mean distance to k-nearest training samples, and the variance of random forest predictions. The results showed that the correlation with the prediction error was highest for SDC. We also demonstrate that SDC allows for the development of robust root mean squared error (RMSE) models and makes it possible to not only give a QSAR prediction but also provide an individu
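The two distance-based DA metrics contrasted in the abstract can be illustrated with a minimal sketch. It assumes fingerprints are represented as Python sets of on-bit indices, and the function names are hypothetical. The abstract does not state the weighting function used in SDC; the exponential form exp(-a*d/(1-d)) and the default a = 3.0 below are assumptions chosen only to show a contribution that decays with Tanimoto distance and is summed over all training molecules.

```python
import math


def tanimoto_distance(fp_a, fp_b):
    """Tanimoto distance 1 - |A & B| / |A | B| for fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(fp_a & fp_b) / union


def mean_knn_distance(query, training, k=3):
    """Distance-to-model metric: mean distance to the k nearest training samples."""
    nearest = sorted(tanimoto_distance(query, fp) for fp in training)[:k]
    return sum(nearest) / len(nearest)


def sdc_like_score(query, training, a=3.0):
    """Sum of distance-weighted contributions over ALL training molecules.

    ASSUMPTION: the weight exp(-a * d / (1 - d)) and the value of `a` are
    illustrative; the abstract does not specify the weighting function.
    """
    total = 0.0
    for fp in training:
        d = tanimoto_distance(query, fp)
        if d < 1.0:  # a molecule with no shared bits contributes nothing
            total += math.exp(-a * d / (1.0 - d))
    return total


# Toy fingerprints (made up for illustration, not real molecules).
training = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12}]
query_near = {1, 2, 3, 4}  # identical to one training fingerprint
query_far = {20, 21}       # shares no bits with any training fingerprint

near_score = sdc_like_score(query_near, training)
far_score = sdc_like_score(query_far, training)
```

In this sketch a query close to many training molecules receives a high score and a query far from all of them receives a score near zero, which is the sense in which such a score can be related to expected prediction error. Unlike the k-nearest-neighbor mean, the sum has no cutoff at k, so every sufficiently similar training molecule adds to it.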


Bibliographic record

Source
Journal of Chemical Information and Modeling | 2018, Issue 8 | 15 pages
Authors
Liu Ruifeng; Glover Kyle P.; Feasel Michael G.; Wallqvist Anders
Author affiliations

US Army Med Res & Mat Command US Dept Def Biotechnol High Performance Comp Software Applica Telemed & Adv Technol Res Ctr Ft Detrick MD 21702 USA;

Def Threat Reduct Agcy Aberdeen Proving Ground MD 21010 USA;

US Army Edgewood Chem Biol Ctr Operat Toxicol Aberdeen Proving Ground MD 21010 USA;

US Army Med Res & Mat Command US Dept Def Biotechnol High Performance Comp Software Applica Telemed & Adv Technol Res Ctr Ft Detrick MD 21702 USA;

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Chemistry; Chemical industry

