Journal: Chemical Engineering Journal

Molecular image-convolutional neural network (CNN) assisted QSAR models for predicting contaminant reactivity toward OH radicals: Transfer learning, data augmentation and model interpretation


Abstract

In this study, we used molecular images as a representation of organic compounds and combined them with a convolutional neural network (CNN) to develop quantitative structure-activity relationships (QSARs) for predicting the rate constants of compounds toward OH radicals. We applied transfer learning and data augmentation to train the molecular image-CNN models and used the Gradient-weighted Class Activation Mapping (Grad-CAM) method to interpret them. Results showed that data augmentation and transfer learning effectively enhanced the robustness and predictive performance of the models: the root-mean-square error (RMSE) on the test dataset (RMSEtest) decreased from 0.395-0.45 to 0.284-0.339 after applying data augmentation, and the RMSE on the training dataset (RMSEtrain) decreased from 0.452-0.592 to 0.123-0.151 after applying transfer learning. The resulting molecular image-CNN models showed predictive performance (RMSEtest 0.284-0.339) comparable to that of molecular fingerprint-based models (RMSEtest 0.30-0.35). Grad-CAM interpretation showed that the molecular image-CNN models correctly selected the molecular features in the images and identified key functional groups that influenced reactivity. The applicability domain analysis showed that the molecular image-CNN models have a broader applicability domain than molecular fingerprint-based models, and that the reactivity of any new compound with a maximum similarity of over 0.85 to the compounds in the training dataset can be reliably predicted. This study demonstrated that molecular image-CNN is a new tool for developing QSARs for environmental applications and can be used to build trustworthy models that make meaningful predictions.
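The two quantitative checks mentioned in the abstract, the RMSE metric and the maximum-similarity applicability domain criterion, can be illustrated with a minimal Python sketch. The abstract does not state which similarity metric or fingerprint was used, so the Tanimoto coefficient over sets of "on" fingerprint bits is an assumption here, and all function names and example bit sets are hypothetical, not taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two sets of 'on' bit indices (ASSUMED metric)."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def max_similarity(query_bits, training_bits):
    """Highest similarity between a query compound and any training compound."""
    return max(tanimoto(query_bits, fp) for fp in training_bits)


def in_applicability_domain(query_bits, training_bits, threshold=0.85):
    """True if the maximum similarity exceeds the threshold.

    The 0.85 default mirrors the 'over 0.85' criterion in the abstract.
    """
    return max_similarity(query_bits, training_bits) > threshold


# Hypothetical fingerprints, written as sets of active bit indices.
training = [{1, 2, 3, 4, 5, 6, 7}, {10, 11, 12}]
query_close = {1, 2, 3, 4, 5, 6, 7, 8}   # shares 7 of 8 bits with the first entry
query_far = {1, 2, 20, 21}               # shares few bits with any entry

print(in_applicability_domain(query_close, training))  # inside the domain
print(in_applicability_domain(query_far, training))    # outside the domain
```

In practice the bit sets would come from a cheminformatics toolkit rather than being written by hand; the threshold logic stays the same.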
