Prediction of chemical properties and biological activities of organic compounds from molecular structure and use of probabilistic and generalized regression neural networks.

Abstract

This thesis describes the development of quantitative structure-activity relationships (QSAR) for several different sets of compounds, and the methodology used to obtain them. QSAR models provide statistical, and often meaningful and interpretable, relationships between the physical characteristics of molecules and their observed activities. The QSAR model-building process used to develop the models presented in this thesis is described. Aspects of molecular representation and modeling are discussed. This is followed by a discussion of the ways in which various aspects of molecular structure may be encoded through the use of topological, geometric, electronic and polar surface area descriptors. The process of selecting pertinent descriptor subsets using the stochastic optimization methods of genetic algorithms (GA) and generalized simulated annealing (GSA) is outlined. The GA and GSA are used with multiple linear regression (MLR), computational neural networks (CNN) or generalized regression neural networks (GRNN) to find high-quality quantitative models, and with linear discriminant analysis (LDA), k-nearest neighbors analysis (k-NN), and probabilistic neural networks (PNN) to find high-quality classification models. Each model presented is validated using a set of compounds that was not used to build the model.

The theory of the PNN and its close relative, the GRNN, is discussed in detail. Effective PNN models are presented that identify molecules as potential human soluble epoxide hydrolase inhibitors using a binary classification scheme. A GRNN model is presented that predicts the aqueous solubility of nitrogen- and oxygen-containing small organic molecules. For the applications presented, the predictive power of the PNN and GRNN models is found to be equivalent to that of previously examined methodologies such as k-NN classification and MLFN function approximation, while requiring significantly fewer input descriptors.

Predictive quantitative structure-property relationships (QSPRs) are presented that link topological molecular structure and derived amino acid parameters with the ion mobility spectrometry collision cross sections of a set of 113 singly-protonated, lysine-terminated peptides from a tryptic digest of common proteins. A trivial linear model using only the number of atoms as an independent variable is able to predict 88 of 113 peptide collision cross sections (78%) to within 2% of their experimentally determined values. (Abstract shortened by UMI.)
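To make the relationship between the two network types concrete, the following is a minimal sketch of the textbook forms of the GRNN and PNN. It is not taken from the thesis. It assumes a Gaussian kernel with a single smoothing parameter `sigma` shared by all descriptors, equal class priors for the PNN, and descriptor vectors supplied as plain lists of floats. The function names `grnn_predict` and `pnn_classify` are illustrative only.

```python
import math


def _sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def grnn_predict(train_x, train_y, query, sigma):
    """GRNN estimate: kernel-weighted average of the training targets.

    Assumes a Gaussian kernel with one shared smoothing parameter sigma.
    """
    weights = [
        math.exp(-_sq_dist(x, query) / (2.0 * sigma ** 2)) for x in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed; fall back to the unweighted mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def pnn_classify(train_x, train_labels, query, sigma):
    """PNN decision: class with the largest average kernel response.

    Assumes equal class priors and equal misclassification costs.
    """
    sums, counts = {}, {}
    for x, label in zip(train_x, train_labels):
        k = math.exp(-_sq_dist(x, query) / (2.0 * sigma ** 2))
        sums[label] = sums.get(label, 0.0) + k
        counts[label] = counts.get(label, 0) + 1
    return max(sums, key=lambda c: sums[c] / counts[c])


if __name__ == "__main__":
    # Toy, made-up descriptor values purely for illustration.
    xs = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
    labels = ["inactive", "inactive", "active", "active"]
    print(pnn_classify(xs, labels, [4.5, 5.5], sigma=1.0))
    print(grnn_predict([[0.0], [1.0]], [0.0, 2.0], [0.5], sigma=1.0))
```

Both functions store the training set verbatim and share the same kernel, which is why the two networks are usually described as close relatives: the GRNN averages target values, while the PNN compares per-class kernel density estimates. In this simplified form the only adjustable parameter is `sigma`.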

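Bibliographic record

  • Author: Mosier, Philip D.
  • Author affiliation: The Pennsylvania State University
  • Degree-granting institution: The Pennsylvania State University
  • Subject: Chemistry, General
  • Degree: Ph.D.
  • Year: 2003
  • Pages: 284 p.
  • Total pages: 284
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Chemistry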
