Conference: 2018 17th IEEE International Conference on Trust, Security and Privacy In Computing and Communications, 12th IEEE International Conference on Big Data Science and Engineering

Effective Integration of Geotagged, Ancilliary Longitudinal Survey Datasets to Improve Adulthood Obesity Predictive Models

Abstract

Obesity is a critical health issue worldwide and has been identified as a leading cause of chronic diseases such as cardiovascular disease, type 2 diabetes, stroke, and certain types of cancer. In 2014, 1.9 billion adults were overweight and 600 million were obese. In this study, to facilitate early detection of childhood obesity, we present a methodology for effective data integration that allows modelers to import new attributes from auxiliary datasets using geospatial proximity, together with an estimate of the data uncertainty that the aggregation process introduces at each data point when that attribute is estimated. We use the data uncertainty estimate as an input to various machine learning algorithms to improve obesity prediction. As a case study, we integrated the National Longitudinal Survey of Youth 1997 dataset with the US Census 2000 dataset and the 2000 CDC Growth Charts dataset to augment our prediction model with behavioral and environmental features. Compared to models with only biometric attributes, our empirical experiments show accuracy improvements when we incrementally consider behavioral aspects (8.9-10.2%), environmental aspects (12.1-12.3%), and data uncertainty estimates (18.3-25.6%).
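The abstract does not spell out how attributes are imported by geospatial proximity or how the accompanying uncertainty is computed. The following is only a minimal sketch of one plausible reading, not the authors' method: it assumes the auxiliary dataset is a list of `(lat, lon, value)` rows, takes the `k` nearest rows by great-circle distance, and returns an inverse-distance-weighted mean together with the weighted standard deviation as a stand-in uncertainty estimate. The names `haversine_km`, `import_attribute`, `k`, and `eps` are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def import_attribute(point, aux_rows, k=3, eps=1e-6):
    """Estimate an auxiliary attribute at `point` and its uncertainty.

    point    -- (lat, lon) of a geotagged survey record
    aux_rows -- iterable of (lat, lon, value) rows from the auxiliary dataset
    k        -- number of nearest auxiliary rows to aggregate (assumed)
    eps      -- small constant that avoids division by zero at distance 0

    Returns (estimate, uncertainty): the inverse-distance-weighted mean of
    the k nearest values and their weighted standard deviation.
    """
    lat, lon = point
    # Distance from the survey point to every auxiliary row, nearest first.
    nearest = sorted(
        (haversine_km(lat, lon, a_lat, a_lon), value)
        for a_lat, a_lon, value in aux_rows
    )[:k]
    if not nearest:
        raise ValueError("aux_rows must contain at least one row")

    weights = [1.0 / (dist + eps) for dist, _ in nearest]
    total = sum(weights)
    estimate = sum(w * v for w, (_, v) in zip(weights, nearest)) / total
    # Spread of the aggregated values around the estimate: an assumed
    # proxy for the uncertainty introduced by aggregation.
    variance = sum(w * (v - estimate) ** 2 for w, (_, v) in zip(weights, nearest)) / total
    return estimate, math.sqrt(variance)
```

Under these assumptions, both returned numbers could be appended to a survey record as two feature columns, so that a downstream classifier sees the imported attribute alongside a measure of how much the nearby auxiliary values disagree.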
