International Conference on Big Data Analysis

Design and Implementation of a Scalable Distributed Web Crawler Based on Hadoop



Abstract

In this article, an efficient and scalable distributed web crawler system based on Hadoop is designed and implemented. The paper first briefly introduces the application of cloud computing to web crawling, then presents the detailed design of a highly scalable crawler system that exploits Hadoop's distributed processing and cloud computing features, and finally reports performance statistics for the system. Under identical conditions, compared with an existing mature system, the distributed web crawler shows a clear advantage. This advantage is particularly important for crawling massive data in the big data era.
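As a rough illustration of the kind of distributed design the abstract describes, the sketch below shows one common way a Hadoop-style crawler partitions its URL frontier: each URL is assigned to one of N crawler nodes by hashing its hostname, so all pages of a site land on the same node (keeping per-host politeness logic local to one worker). This is a minimal sketch under assumed conventions, not the authors' actual design; the function names are illustrative.

```python
# Minimal sketch (assumption, not the paper's implementation): partition a
# URL frontier across crawler nodes by hashing the hostname, analogous to
# the shuffle/partition step of a MapReduce job.
import hashlib
from urllib.parse import urlparse
from collections import defaultdict

def assign_node(url: str, num_nodes: int) -> int:
    """Map a URL to a crawler-node index by hashing its hostname."""
    host = urlparse(url).netloc.lower()
    digest = hashlib.md5(host.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes

def partition_frontier(urls, num_nodes):
    """Group a URL frontier into per-node work lists."""
    buckets = defaultdict(list)
    for url in urls:
        buckets[assign_node(url, num_nodes)].append(url)
    return dict(buckets)

frontier = [
    "http://example.com/a",
    "http://example.com/b",
    "http://example.org/x",
]
parts = partition_frontier(frontier, 4)
# All URLs from the same host always land in the same bucket.
```

Hashing by hostname rather than by full URL is a deliberate choice: it concentrates each site's crawl on one node so politeness delays and robots.txt caching need no cross-node coordination.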

