Journal of Chemometrics

Combination of kernel PCA and linear support vector machine for modeling a nonlinear relationship between bioactivity and molecular descriptors


Abstract

In this paper, a two-step nonlinear classification algorithm, KPCA + LSVM, is proposed to model the structure-activity relationship (SAR) between the bioactivities and molecular descriptors of compounds. It consists of kernel principal component analysis (KPCA) followed by a linear support vector machine (LSVM). KPCA is used to remove uninformative components such as noise and then to capture the latent structure of the training dataset with a set of new variables, the principal components in the kernel-defined feature space. LSVM then uses the maximal margin hyperplane to obtain good generalization performance in the KPCA-transformed space. The combination of KPCA and LSVM improves prediction performance compared with the linear SVM alone as well as with two nonlinear methods. Three datasets related to different categorical bioactivities of compounds are used to evaluate the performance of KPCA + LSVM. The results show that our algorithm is competitive.
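The two-step idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several things are assumptions made only for illustration: the RBF kernel and its `gamma` value (the abstract does not name a kernel), the number of retained components, the plain subgradient-descent solver used in place of a proper SVM optimizer, and the synthetic two-ring data from the hypothetical helper `make_rings`, which stands in for real molecular descriptors.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def fit_kpca(X, n_components, gamma):
    """Step 1: eigendecompose the centered training kernel matrix."""
    K = rbf_kernel(X, X, gamma)
    col_mean = K.mean(axis=0)  # K is symmetric, so these are also the row means
    Kc = K - col_mean[None, :] - col_mean[:, None] + K.mean()
    vals, vecs = np.linalg.eigh(Kc)  # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:n_components]
    # Scale so that projections equal sqrt(eigenvalue) * eigenvector.
    alphas = vecs[:, top] / np.sqrt(vals[top])
    return {"X": X, "col_mean": col_mean, "mean": K.mean(),
            "alphas": alphas, "gamma": gamma}


def transform_kpca(model, X_new):
    """Project points onto the kernel principal components."""
    Kx = rbf_kernel(X_new, model["X"], model["gamma"])
    Kc = (Kx - model["col_mean"][None, :]
          - Kx.mean(axis=1, keepdims=True) + model["mean"])
    return Kc @ model["alphas"]


def fit_linear_svm(Z, y, lam=1e-3, lr=0.5, epochs=500):
    """Step 2: linear soft-margin SVM (labels in {-1, +1}).

    Assumption: plain subgradient descent on the regularized hinge loss,
    chosen for brevity rather than accuracy or speed.
    """
    n = Z.shape[0]
    w, b = np.zeros(Z.shape[1]), 0.0
    for _ in range(epochs):
        viol = y * (Z @ w + b) < 1.0  # points inside the margin
        grad_w = lam * w - (y[viol][:, None] * Z[viol]).sum(axis=0) / n
        grad_b = -y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def make_rings(n_per_class, rng):
    """Hypothetical stand-in for descriptor data: two noisy concentric rings."""
    theta = rng.uniform(0.0, 2.0 * np.pi, size=2 * n_per_class)
    radius = np.r_[np.full(n_per_class, 0.3), np.full(n_per_class, 1.0)]
    X = np.c_[radius * np.cos(theta), radius * np.sin(theta)]
    X += rng.normal(scale=0.05, size=X.shape)
    y = np.r_[np.ones(n_per_class), -np.ones(n_per_class)]
    return X, y


def demo(seed=0):
    """Fit KPCA + linear SVM on training rings, return held-out accuracy."""
    rng = np.random.default_rng(seed)
    X_train, y_train = make_rings(100, rng)
    X_test, y_test = make_rings(50, rng)
    kpca = fit_kpca(X_train, n_components=3, gamma=2.0)
    w, b = fit_linear_svm(transform_kpca(kpca, X_train), y_train)
    scores = transform_kpca(kpca, X_test) @ w + b
    pred = np.where(scores >= 0.0, 1.0, -1.0)
    return float((pred == y_test).mean())


print("held-out accuracy:", demo())
```

The two rings cannot be separated by a line in the original coordinates, which is why the linear classifier is trained on the KPCA scores rather than on the raw inputs. In practice the kernel, its parameters, and the number of components would need to be tuned on the data at hand.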
