
Computing resources sensitive parallelization of neural networks for large scale diabetes data modelling, diagnosis and prediction



Abstract

Diabetes has become one of the most severe diseases, with the number of diabetes patients increasing globally. A large amount of digital data on diabetes has been collected through various channels. How to utilize these data sets to help doctors make decisions on the diagnosis, treatment and prediction of diabetic patients poses many challenges to the research community. The thesis investigates mathematical models, with a focus on neural networks, for large scale diabetes data modelling and analysis by utilizing modern computing technologies such as grid computing and cloud computing. These computing technologies provide users with an inexpensive way to access extensive computing resources over the Internet for solving data-intensive and computationally intensive problems.

This thesis evaluates the performance of seven representative machine learning techniques in the classification of diabetes data. The results show that the neural network produces the best classification accuracy but incurs high overhead in data training. As a result, the thesis develops MRNN, a parallel neural network model based on the MapReduce programming model, which has become an enabling technology in support of data intensive applications in the clouds. By partitioning the diabetic data set into a number of equally sized data blocks, the training workload is distributed among a number of computing nodes to speed up data training. MRNN is first evaluated in small scale experimental environments using 12 mappers and is subsequently evaluated in large scale simulated environments using up to 1000 mappers. Both the experimental and simulation results have shown the effectiveness of MRNN in classification and its high scalability in data training.

MapReduce does not have a sophisticated job scheduling scheme for heterogeneous computing environments in which the computing nodes may have varied computing capabilities. For this purpose, this thesis develops a load balancing scheme based on genetic algorithms with the aim of balancing the training workload among heterogeneous computing nodes. The nodes with more computing capacity receive more MapReduce jobs for execution. Divisible load theory is employed to guide the evolutionary process of the genetic algorithm with the aim of achieving fast convergence. The proposed load balancing scheme is evaluated in large scale simulated MapReduce environments with varied levels of heterogeneity using data sets of different sizes. All the results show that the genetic algorithm based load balancing scheme significantly reduces the makespan of job execution in comparison with the time consumed without load balancing.
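The load balancing idea in the abstract can be illustrated with a small sketch. This is not the thesis's genetic algorithm: it only compares an equal split of training records with a split proportional to node speed, the kind of allocation that divisible load reasoning points toward. The function names, the node speeds (records processed per second) and the record count are all hypothetical values chosen for illustration.

```python
from typing import List


def equal_split(total_records: int, n_nodes: int) -> List[int]:
    """Split records into (nearly) equally sized blocks, one per node."""
    base, extra = divmod(total_records, n_nodes)
    return [base + (1 if i < extra else 0) for i in range(n_nodes)]


def proportional_split(total_records: int, speeds: List[float]) -> List[int]:
    """Give each node a block proportional to its (assumed) speed."""
    total_speed = sum(speeds)
    shares = [int(total_records * s / total_speed) for s in speeds]
    # Truncation can leave a few records unassigned; hand them to the
    # fastest nodes first.
    leftover = total_records - sum(shares)
    fastest_first = sorted(range(len(speeds)), key=lambda i: speeds[i], reverse=True)
    for i in fastest_first[:leftover]:
        shares[i] += 1
    return shares


def makespan(shares: List[int], speeds: List[float]) -> float:
    """Time until the slowest-finishing node completes its block."""
    return max(n / s for n, s in zip(shares, speeds))


if __name__ == "__main__":
    # Hypothetical heterogeneous cluster: speeds in records per second.
    speeds = [1.0, 2.0, 4.0, 1.0]
    total = 8000

    even = equal_split(total, len(speeds))
    weighted = proportional_split(total, speeds)

    print("equal split:       ", even, "makespan:", makespan(even, speeds))
    print("proportional split:", weighted, "makespan:", makespan(weighted, speeds))
```

With these assumed speeds, the equal split is held back by the slowest nodes, while the proportional split lets all nodes finish at about the same time. A scheme such as the one the abstract describes would search for an allocation of this kind rather than compute it in closed form.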

Bibliographic record

  • Authors

    Li M; Qi Hao;

  • Affiliation
  • Year: 2011
  • Total pages
  • Original format: PDF
  • Language: English
  • CLC classification
