Journal: 电脑知识与技术 (Computer Knowledge and Technology)

Integrating Hadoop with Cassandra to Process Massive Data

     

Abstract

Hadoop, an open-source distributed computing framework from the Apache Software Foundation, can efficiently compute on and process massive data sets and can handle tens of millions of concurrent requests from the Internet. However, it does not support real-time reading, writing, and modification of data. Cassandra is a powerful column-oriented key-value distributed database with strong real-time read/write performance and scalability, but it lacks the ability to run analytical computations over massive data. Combining Hadoop with Cassandra lets each offset the other's weaknesses and provides an efficient, practical solution for implementing cloud computing models. This paper first explains why integrating Hadoop with Cassandra is necessary for processing massive data, then presents a concrete integration scheme and its implementation, and finally summarizes the main problems encountered during the integration.
