Journal: Earth and Space Science

Modeling Air Pollution, Climate, and Health Data Using Bayesian Networks: A Case Study of the English Regions


Abstract

The link between pollution and health is commonly explored by trying to identify the dominant cause of pollution and its most significant effect on health outcomes. The use of multivariate features to describe exposure is less explored because investigating a large domain of scenarios is challenging both theoretically (i.e., interpretation of results) and technically (i.e., computational effort). In this work we explore the use of Bayesian Networks with a multivariate approach to identify the probabilistic dependence structure of the environment‐health nexus. This nexus consists of environmental factors (topography and climate), exposure levels (concentration of outdoor air pollutants), and health outcomes (mortality rates). The information is collated for a data‐rich study area: the English regions (UK), which span environmental types ranging in character from urban to rural. We implemented a reproducible workflow in the R programming language to collate environment‐health data and analyze almost 50 million observations, making use of a graphical model (Bayesian Network) and Big Data technologies. Results show that, for pollution and weather variables, the model not only tests well in sample but also has good predictive power when tested out of sample. This is facilitated by a training/testing split of the data along the time and space dimensions and suggests that the model generalizes well to new regions and time periods.
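
The training/testing split along time and space mentioned in the abstract can be illustrated with a minimal sketch. It is written in Python with invented toy records and is not the authors' R workflow. The function names (`split_space_time`, `fit_cpt`, `predict`) are hypothetical, and the single conditional probability table stands in for just one edge (weather to pollution) of a discrete Bayesian Network. Dropping rows that are held out on only one axis is an assumption made here to keep the test set unseen in both region and period.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (region, year, weather, pollution). Not real data.
RECORDS = [
    ("London", 2010, "calm", "high"),
    ("London", 2011, "calm", "high"),
    ("London", 2012, "windy", "low"),
    ("North East", 2010, "windy", "low"),
    ("North East", 2011, "calm", "high"),
    ("North East", 2012, "calm", "low"),
    ("South West", 2011, "calm", "high"),
    ("South West", 2012, "windy", "low"),
    ("South West", 2012, "calm", "high"),
]


def split_space_time(records, test_regions, first_test_year):
    """Train on earlier years in non-held-out regions; test on later years
    in held-out regions. Rows held out on only one axis are dropped."""
    train = [r for r in records
             if r[0] not in test_regions and r[1] < first_test_year]
    test = [r for r in records
            if r[0] in test_regions and r[1] >= first_test_year]
    return train, test


def fit_cpt(rows, alpha=1.0):
    """Estimate P(pollution | weather) from counts with Laplace smoothing."""
    counts = defaultdict(Counter)
    for _, _, weather, pollution in rows:
        counts[weather][pollution] += 1
    levels = sorted({r[3] for r in rows})
    cpt = {}
    for weather, c in counts.items():
        total = sum(c.values()) + alpha * len(levels)
        cpt[weather] = {lvl: (c[lvl] + alpha) / total for lvl in levels}
    return cpt


def predict(cpt, weather):
    """Return the most probable pollution level for a weather state.
    Raises KeyError if the weather state was never seen in training."""
    probs = cpt[weather]
    return max(probs, key=probs.get)


train, test = split_space_time(RECORDS, {"South West"}, 2012)
cpt = fit_cpt(train)
hits = sum(predict(cpt, w) == p for _, _, w, p in test)
accuracy = hits / len(test)
print(f"train rows: {len(train)}, test rows: {len(test)}")
print(f"out-of-sample accuracy: {accuracy:.2f}")
```

Holding out a region and a period at the same time gives a stricter check than a random split, because no test row shares its region or its years with the training data.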
