Apache Spark and Hadoop Based Big Data Processing System for Clinical Research

Sreekanth Rallapalli; Gondkar R. R.


Abstract

The use of big data from the medical field is gaining popularity in healthcare services and clinical research. The medical field is one of the largest sources of data, generating it in enormous volume and variety. Traditional systems cannot handle such big data, which is characterized by volume, variety, velocity, veracity and value (the five V's). Processing this amount of data requires a framework that can process it in parallel on clusters of commodity hardware, and the resulting system should be reliable and fault-tolerant. Apache Spark is a fast, in-memory data processing engine with expressive development APIs that let data workers efficiently execute streaming, machine learning or SQL workloads requiring fast iterative access to datasets. The Hadoop framework supports MapReduce applications that scale from a single node to thousands of machines. This paper investigates big data used in clinical research to find patients with similar patterns and to identify patients who require intensive care. Patients can also be informed about predicted future outcomes. We propose a ten-node Hadoop cluster to run distributed MapReduce algorithms. The algorithms show efficient processing of big clinical data, and the results can be used to support efficient, personalized decisions for patients. The data sets used for the results are taken from MIMIC-III, an open-source database that is one of the largest data repositories.


Bibliographic record

Source
International Journal of Applied Engineering Research, 2018, Issue 2, 5 pages
Authors
Sreekanth Rallapalli; Gondkar R. R.
Affiliations
R&D Centre, Bharathiyar University; CMR University
Original format
PDF
Language
English
Chinese Library Classification
Basic engineering sciences
Keywords
Apache Spark; Big data; Hadoop; MapReduce; Machine learning

