Journal: Environmental Pollution

Machine learning models accurately predict ozone exposure during wildfire events

Abstract

Epidemiologists use prediction models to downscale (i.e., interpolate) air pollution exposure where monitoring data is insufficient. This study compares machine learning prediction models for ground-level ozone during wildfires, evaluating the predictive accuracy of ten algorithms on the daily 8-hour maximum average ozone during a 2008 wildfire event in northern California. Models were evaluated using a leave-one-location-out cross-validation (LOLO CV) procedure to account for the spatial and temporal dependence of the data and produce more realistic estimates of prediction error. LOLO CV avoids both the well-known overly optimistic bias of k-fold cross-validation on dependent data and the conservative bias of evaluating prediction error over a coarser spatial resolution via leave-k-locations-out CV. Gradient boosting was the most accurate of the ten machine learning algorithms with the lowest LOLO CV estimated root mean square error (0.228) and the highest LOLO CV $\hat{R}^2$ (0.677). Random forest was the second best performing algorithm with an LOLO CV $\hat{R}^2$ of 0.661. The LOLO CV estimates of predictive accuracy were less optimistic than 10-fold CV estimates for all ten models. The difference in estimated accuracy between the 10-fold CV and LOLO CV was greater for more flexible models like gradient boosting and random forest. The order of estimated model accuracy depended on the choice of evaluation metric, indicating that 10-fold CV and LOLO CV may select different models or sets of covariates as optimal, which calls into question the reliability of 10-fold CV for model (or variable) selection. These prediction models are designed for interpolating ozone exposure, and are not suited to inferring the effect of wildfires on ozone or extrapolating to predict ozone in other spatial or temporal domains. This is demonstrated by the inability of the best performing models to accurately predict ozone during 2007 southern California wildfires.
(C) 2019 Elsevier Ltd. All rights reserved.
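
The leave-one-location-out procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (`lolo_splits`, `lolo_cv`), the toy data, and the mean-only "model" are all hypothetical, and the R-squared here uses the common 1 - SS_res/SS_tot definition, which may differ from the estimator reported in the paper. The idea shown is only that every observation from one monitoring location is held out together, so a model is never scored on a site it saw during training.

```python
import math
from collections import defaultdict


def lolo_splits(locations):
    """Yield (held_out_location, train_idx, test_idx), one location at a time."""
    by_location = defaultdict(list)
    for i, loc in enumerate(locations):
        by_location[loc].append(i)
    for loc, test_idx in by_location.items():
        held = set(test_idx)
        train_idx = [i for i in range(len(locations)) if i not in held]
        yield loc, train_idx, test_idx


def rmse(y_true, y_pred):
    """Root mean square error."""
    sq = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sq / len(y_true))


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot (assumed definition; can be negative)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def lolo_cv(X, y, locations, fit, predict):
    """Pool out-of-location predictions, then score them once."""
    preds = [None] * len(y)
    for _, train_idx, test_idx in lolo_splits(locations):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            preds[i] = predict(model, X[i])
    return rmse(y, preds), r_squared(y, preds)


# Hypothetical stand-in model: predicts the training mean, ignores covariates.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x):
    return model


# Toy data: three monitoring sites, two daily observations each.
locations = ["A", "A", "B", "B", "C", "C"]
X = [[0.0]] * 6
y = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]

cv_rmse, cv_r2 = lolo_cv(X, y, locations, fit_mean, predict_mean)
print(cv_rmse, cv_r2)
```

Any regressor can be passed in place of `fit_mean` and `predict_mean`. Ordinary k-fold CV would instead split rows at random, letting observations from the same site appear in both training and test folds, which is the source of the optimistic bias the abstract describes.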
