Performance Considerations of Data Acquisition in Hadoop System

Abstract

Data have become increasingly important in recent years, especially for large companies, and mining useful information from these data is of great benefit. The oil and gas industry has to deal with vast amounts of data, in both real-time and historical contexts. Because the amount of data is so large, processing it is usually infeasible or very time-consuming. In our project we investigate the use of Hadoop to solve this problem. To run Hadoop jobs, the data must first exist in the Hadoop file system, which creates the problem of data acquisition. In this paper, two solutions are investigated and their performance is compared; the solution based on Chukwa is shown to be more efficient than a naïve implementation, particularly for larger file sizes.

Bibliographic record

Source: 2nd IEEE International Conference on Cloud Computing Technology and Science, 2010, pp. 545-549 (5 pages)
Authors: Jia Baodong; Wlodarczyk Tomasz Wiktor; Rong Chunming
Original format: PDF
Chinese Library Classification: Computing technology, computer technology
Keywords: Chukwa; Data Acquisition; Hadoop; Historical data; Performance; Real-time data
Date indexed: 2022-08-26 15:00:51

Similar literature

1. Big Data Performance Analysis on a Hadoop Distributed File System Based on Geometric Data Perturbation Technique [J]. V. Santhana Marichamy, V. Natarajan. Procedia Computer Science, 2019, No. 5.
2. H2Hadoop: Improving Hadoop Performance Using Metadata of Related Jobs [J]. S. Kanaka Lakshmi, K. Ramachandra Rao. International Journal of Computer Science and Technology, 2017, No. 4a1.
3. Software-Defined Networking for Scalable Cloud-based Services to Improve System Performance of Hadoop-based Big Data Applications [J]. Desta Haileselassie Hagos. International Journal of Grid and High Performance Computing, 2016, No. 2.
4. Performance Considerations of Data Acquisition in Hadoop System [C]. Jia Baodong, Wlodarczyk Tomasz Wiktor, Rong Chunming. IEEE International Conference on Cloud Computing Technology and Science, 2010.
5. Improving Hadoop performance by using metadata of related jobs in text datasets via enhancing MapReduce workflow [D]. Alshammari, Hamoud. 2016.
6. Hadoop-GIS: A High Performance Spatial Data Warehousing System over MapReduce [O]. Ablimit Aji, Fusheng Wang, Hoang Vo.
7. Data acquisition in Hadoop system [O]. Jia Baodong. 2010.
