
Scaling Up Data Integration and Analysis of Sensor Data for Pediatric Asthma


Abstract

Childhood asthma is a serious chronic disease affecting 9.3 percent of the American pediatric population. Current epidemiological research tools are limited in the types of information they can collect and aggregate, which restricts their predictive power. The NIH Pediatric Research using Integrated Sensor Monitoring Systems (PRISMS) program is predicated on the idea that better, more accurate sensing of physiology, environmental exposures, and local context can greatly enhance scientific and clinical understanding towards more effective management of chronic disease. At the PRISMS Data and Software Coordination and Integration Center (DSCIC), we are developing a general data integration and analysis architecture built upon Apache Kafka and Apache Spark. Our system enables biomedical researchers to (1) collect streaming data from multiple sensors, including at-home and personal pollution monitors, environmental and weather sources, geospatial features and personal trajectories, and Ecological Momentary Assessment (EMA) data collected from subjects' mobile phones; (2) map these heterogeneous datasets to a common schema to facilitate analysis; and (3) train and apply statistical models over both streaming and historical data at scale. Our target applications include predicting asthma exacerbations, identifying key personal-level triggers, and eventually providing closed-loop interventions. We will present an overview of the PRISMS-DSCIC data integration and analysis architecture, demonstrate a statistical analysis use case, and describe how our architecture enabled the development of a novel data mining approach that uses publicly available spatial data (OpenStreetMap) and meteorological data (Dark Sky) to build an air quality model for predicting the concentration of PM2.5 at a fine spatiotemporal resolution.
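
Step (2) of the abstract, mapping heterogeneous sensor feeds to a common schema, can be illustrated with a minimal sketch. This is not the PRISMS-DSCIC implementation: the abstract does not describe its schema, so the `Observation` record, the two adapter functions, and both input payload layouts below are hypothetical, and the example uses plain Python rather than Kafka or Spark.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """Hypothetical common schema: one measured variable at one point in time."""
    subject_id: Optional[str]   # None for area-level sources such as weather
    source: str
    variable: str
    value: float
    unit: str
    observed_at: datetime
    lat: Optional[float] = None
    lon: Optional[float] = None


def from_personal_monitor(rec: dict) -> Observation:
    # Assumed payload: {"uid": str, "pm25": number or str, "ts": epoch seconds}
    return Observation(
        subject_id=rec["uid"],
        source="personal_monitor",
        variable="pm2.5",
        value=float(rec["pm25"]),
        unit="ug/m3",
        observed_at=datetime.fromtimestamp(rec["ts"], tz=timezone.utc),
    )


def from_weather_feed(rec: dict) -> Observation:
    # Assumed payload: {"time": ISO 8601 with offset, "temperature_f": number,
    #                   "latitude": number, "longitude": number}
    celsius = (float(rec["temperature_f"]) - 32.0) * 5.0 / 9.0
    return Observation(
        subject_id=None,
        source="weather_feed",
        variable="temperature",
        value=round(celsius, 2),  # normalise units so sources are comparable
        unit="degC",
        observed_at=datetime.fromisoformat(rec["time"]),
        lat=float(rec["latitude"]),
        lon=float(rec["longitude"]),
    )


monitor_obs = from_personal_monitor({"uid": "s01", "pm25": "12.5", "ts": 0})
weather_obs = from_weather_feed({
    "time": "2018-01-01T00:00:00+00:00",
    "temperature_f": 212,
    "latitude": 34.0,
    "longitude": -118.0,
})
```

Once every source is expressed as the same record type, downstream analysis can filter, join, and aggregate on `variable`, `observed_at`, and location without source-specific logic, which is the benefit the abstract attributes to a common schema.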
