IEEE International Conference on Big Data Computing Service and Applications

Automated Hot-Spot Identification for Spatial Investigation of Disease Indicators

Abstract

This paper presents a new procedure that uses spatial statistics to identify clusters of counties with either a high or a low incidence of a disease (the dependent variable). These counties provide a spatial snapshot that describes the disease in the study area. Using this snapshot as a reference, the procedure evaluates potential factors (independent variables) and sorts them by how closely their own spatial snapshots resemble that of the disease. The greater the similarity, the greater the likelihood of a causal relationship. Similarity can also guide the selection of variables to consider, rather than relying only on the researcher's expertise. In particular, the procedure is used to analyze Cardiovascular Disease at the county level for the contiguous 48 states using the Public Health Exposome, a data repository of environmental factors to which a given group of people may be exposed over the course of their lifetime and that may impact their health. The proposed procedure enables the analysis of a study area with a large number of regions, such as an entire country, while still resolving detail at the level of a smaller area, such as a county. By contrast, researchers may limit their work to a small number of regions because of computational and analytical limitations. In addition, the procedure yields a ranking of independent variables according to their effect on the dependent variable. In the past, public health researchers reported that analytical approaches required days of extremely complex statistics and computation, which restricted their analysis to 60 variables. Run at the Texas Tech High Performance Computing Center, the proposed procedure takes 12 minutes for 168 variables and a study area with 3,028 regions.
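
The abstract does not name the spatial statistic or the similarity measure, so the following is only a minimal sketch of the general idea under stated assumptions: hot and cold spots are flagged with the Getis-Ord Gi* statistic at |z| >= 1.96, and two snapshots are compared by the fraction of regions that carry the same label. All function names (`gi_star`, `snapshot`, `similarity`, `rank_factors`, `grid_weights`) and the toy grid of regions are hypothetical and are not taken from the paper.

```python
import numpy as np

def grid_weights(rows, cols):
    """Binary rook-adjacency matrix for a rows x cols grid of regions.
    Each region counts as its own neighbor, as Gi* requires."""
    n = rows * cols
    w = np.eye(n)
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                w[i, i + 1] = w[i + 1, i] = 1.0
            if r + 1 < rows:
                w[i, i + cols] = w[i + cols, i] = 1.0
    return w

def gi_star(values, weights):
    """Getis-Ord Gi* z-score for every region (assumed statistic).
    Assumes the values are not constant and that no region
    neighbors every other region, so the denominator is nonzero."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    mean = x.mean()
    s = np.sqrt((x ** 2).mean() - mean ** 2)
    w_sum = w.sum(axis=1)
    w_sq = (w ** 2).sum(axis=1)
    numerator = w @ x - mean * w_sum
    denominator = s * np.sqrt((n * w_sq - w_sum ** 2) / (n - 1))
    return numerator / denominator

def snapshot(values, weights, z_crit=1.96):
    """Label each region: 1 = hot spot, -1 = cold spot, 0 = neither."""
    z = gi_star(values, weights)
    return np.where(z >= z_crit, 1, np.where(z <= -z_crit, -1, 0))

def similarity(snap_a, snap_b):
    """Fraction of regions with the same label (assumed measure)."""
    return float(np.mean(snap_a == snap_b))

def rank_factors(disease, factors, weights):
    """Sort candidate factors by snapshot similarity to the disease."""
    reference = snapshot(disease, weights)
    scores = {
        name: similarity(reference, snapshot(vals, weights))
        for name, vals in factors.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `rank_factors(disease, {"factor_a": a, "factor_b": b}, weights)` returns `(name, score)` pairs with the most similar factor first. A real county-level study would build `weights` from county adjacency rather than from a grid.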