MapReduce: Simplified Data Processing on Large Clusters

Abstract

MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model, as shown in the paper.

Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system.

Our implementation of MapReduce runs on a large cluster of commodity machines and is highly scalable: a typical MapReduce computation processes many terabytes of data on thousands of machines. Programmers find the system easy to use: hundreds of MapReduce programs have been implemented and upwards of one thousand MapReduce jobs are executed on Google's clusters every day.
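
The map/reduce contract described in the abstract can be illustrated with a small word-count sketch. This is a single-process Python illustration only, not the paper's implementation: the names `map_word_count`, `reduce_word_count`, and `run_mapreduce` are hypothetical, and the in-memory grouping step stands in for the partitioning, scheduling, and fault handling that the real run-time system performs across machines.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_word_count(doc_name: str, text: str) -> Iterator[Tuple[str, int]]:
    """Map: take one (key, value) input pair and emit intermediate (word, 1) pairs."""
    for word in text.split():
        yield word.lower(), 1


def reduce_word_count(word: str, counts: Iterable[int]) -> int:
    """Reduce: merge all intermediate values that share the same intermediate key."""
    return sum(counts)


def run_mapreduce(
    inputs: Iterable[Tuple[str, str]],
    map_fn: Callable[[str, str], Iterable[Tuple[str, int]]],
    reduce_fn: Callable[[str, Iterable[int]], int],
) -> Dict[str, int]:
    """Hypothetical local driver: runs every map call, groups by key, then reduces.

    A real deployment would distribute these steps over many machines;
    here everything happens sequentially in one process.
    """
    intermediate: Dict[str, List[int]] = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in map_fn(key, value):
            intermediate[out_key].append(out_value)
    # Sort keys so the result order is deterministic.
    return {k: reduce_fn(k, vs) for k, vs in sorted(intermediate.items())}


if __name__ == "__main__":
    docs = [("a.txt", "the quick fox"), ("b.txt", "the lazy dog the end")]
    print(run_mapreduce(docs, map_word_count, reduce_word_count))
```

Because the user code consists only of the two functions, the driver is free to decide how inputs are split and where each call runs, which is the property that lets the model be parallelized automatically.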


Bibliographic record

Source: Proceedings of the Sixth Symposium on Operating Systems Design and Implementation (OSDI '04), 2004, pp. 137-149 (13 pages)
Conference location: San Francisco, CA (US)
Authors: Jeffrey Dean; Sanjay Ghemawat
Affiliation: Google, Inc.
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology

