Journal of Chemical Information and Modeling

Applications of Machine Learning to In Silico Quantification of Chemicals without Analytical Standards

Abstract

Non-targeted analysis provides a comprehensive approach to analyzing environmental and biological samples for nearly all chemicals present. One of the main shortcomings of current analytical methods and workflows is that they cannot provide quantitative information, which is an important obstacle to understanding environmental fate and human exposure. Herein, we present an in silico quantification method using machine learning for chemicals analyzed using electrospray ionization (ESI). We considered three data sets from different instrumental setups: (i) capillary electrophoresis electrospray ionization-mass spectrometry (CE-MS) in positive ionization mode (ESI+), (ii) liquid chromatography quadrupole time-of-flight mass spectrometry (LC-QTOF/MS) in ESI+, and (iii) LC-QTOF/MS in negative ionization mode (ESI-). We developed and applied two different machine-learning algorithms, a random forest (RF) and an artificial neural network (ANN), to predict the relative response factors (RRFs) of different chemicals based on their physicochemical properties. Chemical concentrations can then be calculated by dividing the measured abundance of a chemical, as peak area or peak height, by its corresponding RRF. We evaluated our models and tested their predictive power using 5-fold cross-validation (CV) and y-randomization. Both the RF and the ANN models showed great promise in predicting RRFs. However, the accuracy of the predictions depended on the data set composition and the experimental setup. For the CE-MS ESI+ data set, the best model predicted measured RRFs with a mean absolute error (MAE) of 0.19 log units and a cross-validation coefficient of determination (Q²) of 0.84 for the testing set. For the LC-QTOF/MS ESI+ data set, the best model predicted measured RRFs with an MAE of 0.32 and a Q² of 0.40. For the LC-QTOF/MS ESI- data set, the best model predicted measured RRFs with an MAE of 0.50 and a Q² of 0.20. Our findings suggest that machine-learning algorithms can be used to predict concentrations of non-targeted chemicals with reasonable uncertainties, especially in ESI+, while application to ESI- remains a more challenging problem.
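
The quantification step and the two reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a model returns the RRF on a log10 scale (the abstract reports errors in log units) and that Q² follows the common definition 1 - PRESS/SS_tot, which the abstract does not spell out. All function names and numeric inputs are hypothetical, and the unit of the resulting concentration depends on how the RRF was defined in the calibration data.

```python
def concentration_from_rrf(abundance, log10_rrf):
    """Estimate a concentration as measured abundance divided by the RRF.

    Assumption: the model predicts log10(RRF), so it is back-transformed first.
    """
    rrf = 10.0 ** log10_rrf
    return abundance / rrf


def mean_absolute_error(measured, predicted):
    """Mean absolute error between measured and predicted values."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def q_squared(measured, predicted):
    """Q² = 1 - PRESS / SS_tot, using predictions made on held-out folds.

    Assumption: this common definition is used; the abstract gives no formula.
    """
    mean_measured = sum(measured) / len(measured)
    press = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - press / ss_tot


def fold_error(mae_log10):
    """Convert an error in log10 units into a multiplicative factor."""
    return 10.0 ** mae_log10


# Hypothetical values, for illustration only.
peak_area = 1000.0          # measured abundance (arbitrary units)
predicted_log10_rrf = 2.0   # hypothetical model output
print(concentration_from_rrf(peak_area, predicted_log10_rrf))
print(fold_error(0.19))     # factor implied by an MAE of 0.19 log units
```

Because concentration is abundance divided by RRF, an error of x log units in the predicted RRF carries over as a factor of 10^x in the estimated concentration. Under the log10 assumption, an MAE of 0.19 log units corresponds to a typical factor of about 1.5.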