Journal: Cluster Computing

A Gaussian process based big data processing framework in cluster computing environment

Abstract

Machine learning algorithms play a vital role in predicting disease outbreaks driven by climate change. Dengue outbreaks are attributed to improper maintenance of water storage, lack of urbanization, deforestation, and lack of vaccination and awareness. Moreover, the number of dengue cases varies with the climate season, so a prediction model that relates dengue outbreaks to climate is needed. This paper applies a Gaussian process regression (GPR) model that uses the seasonal averages of several climate parameters: maximum temperature, minimum temperature, precipitation, wind, relative humidity, and solar radiation. The number of dengue cases and the climate data for each block of Tamil Nadu, India, are collected from the Integrated Disease Surveillance Project and Global Weather Data for SWAT Inc, respectively. Local Moran's I spatial autocorrelation is used for the geographical visualization of hotspot regions, and the dengue outbreak and its hotspot regions are visualized geographically with ArcGIS 10.1. The day-wise big climate data are collected and stored in a Hadoop cluster computing environment. The MapReduce framework reduces the day-wise climate data to seasonal climate averages for winter, summer, and monsoon. The seasonal climate data and the number of dengue cases (health data) are integrated by geolocation (latitude and longitude). GPR is then used to build the dengue prediction model from the integrated climate and health data. The proposed Gaussian process based prediction model is compared with other machine learning approaches: multiple regression, support vector machines, and random forests. Experimental results demonstrate the effectiveness of the Gaussian process based prediction framework.
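
The aggregation and integration steps described in the abstract can be illustrated with a minimal in-memory sketch. This is not the authors' Hadoop implementation: the record layout, the function names, the sample values, and in particular the month-to-season mapping are all assumptions made for illustration, since the abstract names the seasons but not their boundaries.

```python
from collections import defaultdict

# Hypothetical month-to-season mapping (an assumption; the abstract only
# names the seasons winter, summer, and monsoon).
SEASON_OF_MONTH = {
    1: "winter", 2: "winter",
    3: "summer", 4: "summer", 5: "summer",
    6: "monsoon", 7: "monsoon", 8: "monsoon", 9: "monsoon",
    10: "monsoon", 11: "monsoon", 12: "monsoon",
}


def map_record(record):
    """Map step: emit ((lat, lon, season, parameter), value) per reading."""
    lat, lon, month, readings = record
    season = SEASON_OF_MONTH[month]
    for parameter, value in readings.items():
        yield (lat, lon, season, parameter), value


def seasonal_averages(records):
    """Group mapped pairs by key (shuffle), then average each group (reduce)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_record(record):
            groups[key].append(value)
    return {key: sum(values) / len(values) for key, values in groups.items()}


def integrate(averages, dengue_cases):
    """Join seasonal climate averages with case counts on (lat, lon, season).

    Exact matching on float coordinates is a simplification for this sketch.
    """
    rows = defaultdict(dict)
    for (lat, lon, season, parameter), mean in averages.items():
        rows[(lat, lon, season)][parameter] = mean
    return {
        key: {**features, "cases": dengue_cases[key]}
        for key, features in rows.items()
        if key in dengue_cases
    }


# Made-up daily records: (latitude, longitude, month, readings).
daily = [
    (11.0, 78.0, 1, {"tmax": 30.0, "precip": 0.0}),
    (11.0, 78.0, 2, {"tmax": 32.0, "precip": 2.0}),
    (11.0, 78.0, 7, {"tmax": 35.0, "precip": 10.0}),
]
# Made-up health data keyed by (latitude, longitude, season).
cases = {(11.0, 78.0, "winter"): 4}

averages = seasonal_averages(daily)
integrated = integrate(averages, cases)
```

Each row of `integrated` pairs the seasonal climate averages for one location with its case count, which is the shape of input a regression model such as GPR would then be fitted on. In an actual Hadoop job the grouping would be performed by the framework's shuffle phase rather than by an in-memory dictionary.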