Journal: Chemometrics and Intelligent Laboratory Systems
Interpretable linear and nonlinear quantitative structure-selectivity relationship (QSSR) modeling of a biomimetic catalytic system by particle swarm optimization based sparse regression


Abstract

A particle swarm optimization (PSO) based sparse regression (PSO-SR) strategy was proposed to study the quantitative structure-selectivity relationship (QSSR) of a biomimetic catalytic system, in which the selectivity of the mild oxidation of o-nitrotoluene to o-nitrobenzaldehyde was related to the molecular descriptors of 48 metalloporphyrin catalysts. PSO was used to find an optimal combination of variables for linear or nonlinear models. For nonlinear modeling, a set of 44 nonlinear transforms was developed for each single descriptor. To keep the models interpretable and reduce the risk of overfitting, the full descriptor set was divided into subclasses, and the selected variables were constrained to be sparsely distributed within each subclass. Model complexity was controlled by adjusting the maximum total number of variables included. Accurate linear and nonlinear PSO-SR models were developed using multiple linear regression (MLR) and partial least squares (PLS), and were validated by randomly splitting the data into training and test objects 500 times. The best predictions were obtained with 10 variables, for both the linear (Q² = 0.9460) and nonlinear (Q² = 0.9505) models. The results indicate that PSO-SR could provide an effective and useful strategy for modeling and interpreting complex QSSR problems. The proposed nonlinear modeling method could provide more information for model interpretation by probing and capturing the unknown nonlinear relationship between a descriptor and the observed selectivity.
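
The abstract does not give implementation details, so the following Python sketch is only one plausible reading of the workflow it describes: PSO searches over variable subsets, each subset is limited to a few variables per descriptor subclass and to a maximum total, and each candidate is scored by the Q² of an MLR model over repeated random training/test splits. All names (`decode`, `q2`, `pso_sr`), the real-valued particle encoding, the PSO coefficients, the test fraction, and the pooled Q² formula are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def decode(position, groups, max_per_group, max_total):
    """Turn a real-valued particle position into a sparse variable subset.

    Assumed encoding: variables with the largest positive scores are kept,
    subject to a per-subclass cap and a cap on the total number of variables.
    """
    chosen, per_group = [], {}
    for j in np.argsort(-np.asarray(position)):   # highest score first
        if position[j] <= 0 or len(chosen) >= max_total:
            break
        g = int(groups[j])
        if per_group.get(g, 0) < max_per_group:
            chosen.append(int(j))
            per_group[g] = per_group.get(g, 0) + 1
    return sorted(chosen)

def q2(X, y, cols, n_splits=50, test_fraction=0.25, seed=0):
    """Pooled Q^2 of an MLR model over repeated random train/test splits.

    Assumed definition: 1 - PRESS / SS_tot, with SS_tot measured around the
    training mean. Assumes y is not constant. The abstract reports 500 splits;
    the default here is smaller only to keep the sketch fast.
    """
    if len(cols) == 0:
        return -np.inf
    rng = np.random.default_rng(seed)             # same splits for every subset
    n = len(y)
    n_test = max(1, int(round(test_fraction * n)))
    A = np.column_stack([np.ones(n), X[:, cols]])  # intercept + selected columns
    press, ss_tot = 0.0, 0.0
    for _ in range(n_splits):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        press += np.sum((y[test] - A[test] @ coef) ** 2)
        ss_tot += np.sum((y[test] - y[train].mean()) ** 2)
    return 1.0 - press / ss_tot

def pso_sr(X, y, groups, max_per_group=1, max_total=10,
           n_particles=15, n_iter=20, seed=0):
    """Minimal PSO loop that maximizes Q^2 over constrained variable subsets."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-1, 1, (n_particles, X.shape[1]))
    vel = np.zeros_like(pos)

    def fitness(p):
        return q2(X, y, decode(p, groups, max_per_group, max_total))

    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    g = int(np.argmax(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Standard velocity update: inertia + cognitive + social terms.
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, -1, 1)
        for i in range(n_particles):
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i].copy(), f
        g = int(np.argmax(pbest_fit))
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return decode(gbest, groups, max_per_group, max_total), gbest_fit
```

Calling `pso_sr(X, y, groups, max_per_group=1, max_total=10)` with a descriptor matrix `X`, a selectivity vector `y`, and an integer subclass label per column returns the selected column indices and their Q². Under the same assumptions, a nonlinear variant could append transformed copies of each descriptor as extra candidate columns that share the original descriptor's subclass label; the 44 transforms used in the paper are not listed in the abstract, so none are reproduced here.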
