Journal: Chemometrics and Intelligent Laboratory Systems

Application of k-means clustering, linear discriminant analysis and multivariate linear regression for the development of a predictive QSAR model on 5-lipoxygenase inhibitors


Abstract

In this work, we developed a quantitative structure-activity relationship (QSAR) model for a family of 5-lipoxygenase (5-LOX) inhibitors, using k-means clustering and linear discriminant analysis (LDA) to select the training and test sets and multivariate linear regression (MLR) to select the independent variables. With k-means clustering, the full set of compounds (58 derivatives of 5-benzylidene-2-phenylthiazolinones) was divided into two clusters according to a simple discriminant function. We found that the piID molecular descriptor (conventional bond order ID number) correctly discriminates 100% of the compounds in each cluster. Thirty different models, divided into three series, were analyzed, and the series with representative training and test sets (series 3) gave the most predictive models. The statistical parameters of the best model are R-train = 0.811 and R-test = 0.801. We found that a rational selection of the training and test sets yields the most predictive models and that random selection is sometimes unsuitable, especially when the full set of compounds can be classified into different clusters according to structural features. © 2015 Elsevier B.V. All rights reserved.
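The workflow summarized in the abstract (cluster the compounds, draw training and test sets that represent every cluster, then fit an MLR model) can be illustrated with a minimal sketch. This is not the authors' procedure or data: the descriptors and activities are synthetic, the helper names (`kmeans_1d`, `stratified_split`, `fit_mlr`) are hypothetical, the "every fourth compound by activity" selection rule is an assumption, and the LDA step is left out.

```python
import random

def kmeans_1d(values, k=2, iters=100):
    """Lloyd's algorithm on a single descriptor; returns one cluster label per value."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the initial centroids evenly over the value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        new = []
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            new.append(sum(members) / len(members) if members else centroids[c])
        if new == centroids:  # assignments are stable, so stop
            break
        centroids = new
    return labels

def stratified_split(labels, activities, every=4):
    """Within each cluster, rank by activity and send every `every`-th compound to the test set."""
    train, test = [], []
    for c in sorted(set(labels)):
        members = sorted((i for i, l in enumerate(labels) if l == c),
                         key=lambda i: activities[i])
        for rank, i in enumerate(members):
            (test if rank % every == every - 1 else train).append(i)
    return train, test

def fit_mlr(X, y):
    """Ordinary least squares with an intercept, via Gauss-Jordan on the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(coef, X):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], x)) for x in X]

def pearson_r(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5

# Synthetic stand-in for 58 compounds: d1 separates two structural groups, d2 does not.
rng = random.Random(0)
n = 58
d1 = [rng.gauss(10, 1) if i < 29 else rng.gauss(20, 1) for i in range(n)]
d2 = [rng.uniform(0, 5) for _ in range(n)]
activity = [0.3 * a - 0.5 * b + rng.gauss(0, 0.5) for a, b in zip(d1, d2)]

labels = kmeans_1d(d1, k=2)
train_idx, test_idx = stratified_split(labels, activity, every=4)

X_train = [[d1[i], d2[i]] for i in train_idx]
X_test = [[d1[i], d2[i]] for i in test_idx]
coef = fit_mlr(X_train, [activity[i] for i in train_idx])

r_train = pearson_r(predict(coef, X_train), [activity[i] for i in train_idx])
r_test = pearson_r(predict(coef, X_test), [activity[i] for i in test_idx])
print(f"R-train = {r_train:.3f}, R-test = {r_test:.3f}")
```

Because the test compounds are drawn from within each cluster rather than from the pooled set, both structural groups are guaranteed to appear in the training and test sets, which is the property the abstract credits for the more predictive models.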
