International Conference on Emerging Trends in Information Technology

Hadoop Scalability and Performance Testing in Homogeneous Clusters

Abstract

Big data refers to datasets that are too large (on the order of gigabytes, terabytes, petabytes, or even zettabytes) or too complex for traditional data-processing software to handle. Distributed and parallel processing has therefore become increasingly important for big data. The two most popular parallel and distributed processing frameworks are Hadoop and Spark, both open-source software frameworks for reliable, scalable, distributed computing. Hadoop, created by the Apache Software Foundation, allows extremely large datasets to be processed on clusters of computers using a simple programming model called MapReduce. It stores data in a distributed file system, HDFS (Hadoop Distributed File System), designed to run on commodity hardware. Hadoop is designed to scale out horizontally from a single machine to thousands of machines, each offering local computation and storage. The performance of a Hadoop cluster depends on the application and on several configuration parameters. In this paper we study the performance of a homogeneous Hadoop cluster by tuning a few such parameters: cluster size, dataset size, and HDFS block size, among others.
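To make the MapReduce programming model concrete, here is a minimal word-count job in Java, the canonical Hadoop example (this sketch is ours, not code from the paper). The driver also shows one place where a parameter studied in the paper, the HDFS block size, could be set per job via dfs.blocksize; the 256 MB value is purely illustrative.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map phase: emit (word, 1) for every token in the input split.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce phase: sum the per-word counts produced by the mappers.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Illustrative only: dfs.blocksize in the client configuration sets the
        // block size of files this job writes to HDFS; cluster-wide tuning (as in
        // the paper's experiments) would normally go in hdfs-site.xml instead.
        conf.setLong("dfs.blocksize", 256L * 1024 * 1024);

        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Packaged as a JAR, a job like this would be launched with something like: hadoop jar wordcount.jar WordCount /input /output. Re-running a fixed job of this kind while sweeping the number of worker nodes, the input size, and dfs.blocksize is the style of experiment the abstract describes.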