Conference: International Conference on Image Information Processing

News Data Analysis from Facebook Through MongoDB and Hive

Abstract

This research presents a comparative study of news data analysis with MongoDB and Hive. News posts (feeds) are gathered from the official Facebook page of The Times of India using the Facebook Graph API. The data is stored both in MongoDB, a NoSQL database, and in Hive, the data warehouse of the Hadoop ecosystem. For better data handling and faster query processing, the data is sharded in MongoDB and partitioned in Hive, with the date field serving as the shard key and the partition key, respectively. The two platforms are compared by running various queries on the data and measuring execution time. The queries search the news posts by date range, by particular keywords in the heading, or by another field, such as the maximum number of laugh reactions. The study was carried out on a single machine for both MongoDB and Hive. The results show lower execution times for all queries with MongoDB than with Hive.
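
The three query types named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract gives no schema, so the field names `created_date`, `message`, and `reactions_haha`, the table name, and the assumption that Hive stores the partition date as a `YYYY-MM-DD` string are all hypothetical. The sketch only builds MongoDB filter documents and HiveQL strings and provides a small timing helper; it does not connect to either system.

```python
import re
import time
from datetime import datetime

# Hypothetical field names; the abstract does not state the real schema.
DATE_FIELD = "created_date"     # assumed shard key / partition key
TEXT_FIELD = "message"          # assumed heading/text field of a post
LAUGH_FIELD = "reactions_haha"  # assumed count of laugh reactions


def mongo_date_range(start: datetime, end: datetime) -> dict:
    """MongoDB filter for posts with start <= date < end."""
    return {DATE_FIELD: {"$gte": start, "$lt": end}}


def mongo_keyword(keyword: str) -> dict:
    """MongoDB filter: case-insensitive substring match on the text field."""
    return {TEXT_FIELD: {"$regex": re.escape(keyword), "$options": "i"}}


def mongo_top_laugh() -> dict:
    """Find-arguments for the single post with the most laugh reactions."""
    return {"filter": {}, "sort": [(LAUGH_FIELD, -1)], "limit": 1}


def hive_date_range(table: str, start: str, end: str) -> str:
    """HiveQL range query on the (assumed string-typed) partition column."""
    return (f"SELECT * FROM {table} "
            f"WHERE {DATE_FIELD} >= '{start}' AND {DATE_FIELD} < '{end}'")


def hive_keyword(table: str, keyword: str) -> str:
    """HiveQL substring match; the keyword is assumed to be a trusted literal."""
    return (f"SELECT * FROM {table} "
            f"WHERE LOWER({TEXT_FIELD}) LIKE '%{keyword.lower()}%'")


def hive_top_laugh(table: str) -> str:
    """HiveQL query for the post with the most laugh reactions."""
    return f"SELECT * FROM {table} ORDER BY {LAUGH_FIELD} DESC LIMIT 1"


def timed(run):
    """Call run() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = run()
    return result, time.perf_counter() - start
```

With a live deployment, the filter documents could be passed to a driver call such as PyMongo's `collection.find(...)`, and the HiveQL strings submitted through any Hive client, each wrapped in `timed` to record the execution time for the comparison.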
