The Design and Implementation of a High-Efficiency Distributed Web Crawler

Abstract

With the rapid development of the Internet, the amount of data online keeps growing, and website technology is constantly changing. Faced with the huge and complex data on the global Internet, how to crawl and use this information has become a major challenge. A traditional stand-alone web crawler struggles to cope with the rapid growth of information and cannot fetch huge amounts of data quickly and effectively. In this paper, we use distributed technology to design and implement an efficient, configurable, load-balanced, and scalable distributed web crawler system.