Journal: Current drug discovery technologies

QSAR Modeling of Carcinogenic Risk Using Discriminant Analysis and Topological Molecular Descriptors


Abstract

A discriminant analysis model is presented for carcinogenic risk. The data set was obtained from the FDA/CDER database of two-year rodent studies and was divided into a training set of 1022 organic compounds and an external validation test set of 50 compounds. The model is designed for use as a decision support tool with a defined decision threshold, and is thus a binary discrimination into "high risk" and "low risk" categories. The carcinogenic risk classification is based on the method for estimating human risk from two-year rodent studies developed at the FDA/CDER/ICSAS. The paradigm chosen for this model allows a straightforward risk analysis based on historic information, as well as the computation of coverage, probability, and confidence metrics that can further qualify the computed result. The molecular structures were represented as MDL mol files. The molecular structure information was obtained as topological structure descriptors, including atom-type and group-type E-State and hydrogen E-State indices, molecular connectivity chi indices, topological polarity, and counts of molecular features. All of these descriptors were computed with the MDL® QSAR software, which was also used to perform all of the discriminant analyses. The reported model is based on fifty-three descriptors, using the nonparametric normal kernel method and the Mahalanobis distance to determine proximity. The model performed very well on the fifty compounds of the test set, yielding the following statistics: 76% correctly classified as "high risk" (carcinogenic) and 84% correctly classified as "low risk" (non-carcinogenic).
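The abstract names the technique (a nonparametric normal-kernel discriminant with Mahalanobis distance) but not its implementation details. The following is a minimal illustrative sketch in Python with NumPy, not the MDL QSAR implementation. It assumes a pooled within-class covariance matrix for the Mahalanobis distance, class priors proportional to training-set frequencies, and a hypothetical bandwidth parameter `r`. The two-descriptor data are synthetic and stand in for the fifty-three descriptors of the reported model. The helper `class_rates` shows how per-class "correctly classified" percentages of the kind quoted above can be computed.

```python
import numpy as np


def kernel_discriminant_predict(X_train, y_train, X_new, r=1.0):
    """Assign each row of X_new to the class with the largest
    prior-weighted normal-kernel density estimate.

    Assumptions of this sketch (not taken from the paper):
    pooled within-class covariance, frequency-based priors,
    and a hypothetical bandwidth r.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_new = np.asarray(X_new, dtype=float)
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)
    n, p = X_train.shape

    # Pooled within-class covariance used for the Mahalanobis distance.
    pooled = np.zeros((p, p))
    for c in classes:
        Xc = X_train[y_train == c]
        centered = Xc - Xc.mean(axis=0)
        pooled += centered.T @ centered
    pooled /= n - len(classes)
    cov_inv = np.linalg.pinv(pooled)

    scores = np.empty((X_new.shape[0], len(classes)))
    for j, c in enumerate(classes):
        Xc = X_train[y_train == c]
        # Squared Mahalanobis distance from every new point to every
        # training point of class c.
        diff = X_new[:, None, :] - Xc[None, :, :]
        d2 = np.einsum("ijk,kl,ijl->ij", diff, cov_inv, diff)
        # Normal kernel averaged over the class; the normalising constant
        # is identical for all classes here, so it is omitted.
        density = np.exp(-0.5 * d2 / r**2).mean(axis=1)
        scores[:, j] = (Xc.shape[0] / n) * density
    return classes[np.argmax(scores, axis=1)]


def class_rates(y_true, y_pred, positive=1):
    """Return (fraction of positives correctly classified,
    fraction of negatives correctly classified)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    pos = y_true == positive
    high_ok = float(np.mean(y_pred[pos] == positive))
    low_ok = float(np.mean(y_pred[~pos] != positive))
    return high_ok, low_ok


# Synthetic two-descriptor example: label 1 = "high risk", 0 = "low risk".
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
                     rng.normal(4.0, 1.0, (100, 2))])
y_train = np.array([0] * 100 + [1] * 100)
X_test = np.vstack([rng.normal(0.0, 1.0, (25, 2)),
                    rng.normal(4.0, 1.0, (25, 2))])
y_test = np.array([0] * 25 + [1] * 25)

y_pred = kernel_discriminant_predict(X_train, y_train, X_test)
high_ok, low_ok = class_rates(y_test, y_pred)
```

In this sketch the bandwidth `r` controls how local the density estimate is: smaller values make the classification depend more heavily on the nearest training compounds.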
