Distributed High-Performance Web Crawler Based on Peer-to-Peer Network

Abstract

Distributing crawling activity among multiple machines spreads the processing and reduces the web page analysis each machine must perform. This paper presents the design of a distributed web crawler based on a peer-to-peer (P2P) network. The distributed crawler harnesses the excess bandwidth and computing resources of the nodes in the system to crawl the web. Each crawler is deployed on a computing node of the P2P network to analyze web pages and generate indices. A control node is a separate node in charge of distributing URLs to balance the load across crawlers. The control nodes are themselves organized as a P2P network. The crawler nodes managed by the same control node form a group. Based on its own ID and the average load of its group, a crawler decides whether to transmit a URL to the control node or keep it for itself. We present an implementation of the distributed crawler based on Igloo and simulate the environment to evaluate load balance across the crawlers and crawl speed.
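The hold-or-forward decision described in the abstract can be illustrated with a small sketch. The abstract does not give the actual rule, so everything concrete here is an assumption made for illustration: URL ownership by a stable hash of the URL modulo the group size, a `slack` factor on the group's average load, and a control node that hands forwarded URLs to the least-loaded crawler. The names `url_owner`, `Crawler`, `ControlNode`, and `handle_discovered_url` are hypothetical, and the sketch does not use Igloo.

```python
import hashlib
from dataclasses import dataclass, field


def url_owner(url: str, group_size: int) -> int:
    """Map a URL to a crawler ID in [0, group_size) with a stable hash.

    Assumption: the paper's ID-based rule is not specified in the abstract;
    hash-mod-N ownership is used here only as a stand-in.
    """
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % group_size


@dataclass
class Crawler:
    crawler_id: int
    group_size: int
    queue: list = field(default_factory=list)

    def load(self) -> int:
        # Load is modeled as the number of queued URLs (an assumption).
        return len(self.queue)

    def should_hold(self, url: str, group_avg_load: float, slack: float = 1.5) -> bool:
        """Hold the URL only if this crawler owns it and is not overloaded."""
        owns = url_owner(url, self.group_size) == self.crawler_id
        not_overloaded = self.load() <= slack * max(group_avg_load, 1.0)
        return owns and not_overloaded


@dataclass
class ControlNode:
    crawlers: list

    def average_load(self) -> float:
        return sum(c.load() for c in self.crawlers) / len(self.crawlers)

    def dispatch(self, url: str) -> int:
        """Assign a forwarded URL to the least-loaded crawler in the group."""
        target = min(self.crawlers, key=lambda c: c.load())
        target.queue.append(url)
        return target.crawler_id


def handle_discovered_url(crawler: Crawler, control: ControlNode, url: str) -> int:
    """Return the ID of the crawler that ends up holding the URL."""
    if crawler.should_hold(url, control.average_load()):
        crawler.queue.append(url)
        return crawler.crawler_id
    return control.dispatch(url)
```

In this sketch a crawler keeps only the URLs that hash to its own ID, and only while its queue stays within `slack` times the group average; everything else goes through the control node, which is where the balancing happens. A real deployment would also need duplicate-URL detection and forwarding between control nodes, both of which are left out.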

Bibliographic details

Source
International Conference on Parallel and Distributed Computing: Applications and Technologies (PDCAT 2004), December 8-10, 2004, Singapore (SG) | 2004 | pp. 50-53 | 4 pages
Conference location: Singapore (SG)
Authors
Liu Fei; Ma Fan-Yuan; Ye Yun-Ming; Li Ming-Lu; Yu Jia-Di;
Author affiliation

Department of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai, P. R. China 200030;

Conference organizer
Original format: PDF
Language: English
Chinese Library Classification: Computer applications
Keywords

Similar literature

1. Measuring Peer-to-Peer Network Topology through Geo-Location-Aware Distributed Crawlers [J]. Pratama PUTRA, Akihiro NAKAO. 電子情報通信学会技術研究報告. 2009, No. 228
2. Measuring Peer-to-Peer Network Topology through Geo-Location-Aware Distributed Crawlers [J]. Pratama PUTRA, Akihiro NAKAO. 電子情報通信学会技術研究報告. ネットワークシステム. Network Systems. 2009, No. 228
3. Distributed High-Performance Web Crawler Based on Peer-to-Peer Network [C]. Liu Fei, Ma Fan-Yuan, Ye Yun-Ming, International Conference on Parallel and Distributed Computing: Applications and Technologies. 2004
4. A high-performance messaging system for peer-to-peer networks [D]. Junginger, Markus Oliver. 2003
5. Distributed Peer-to-Peer Target Tracking in Wireless Sensor Networks [O]. Xue Wang, Sheng Wang, Dao-Wei Bi. 2007
6. Design and Implementation of a High-Performance Distributed Web Crawler [O]. Vladislav Shkapenyuk, Torsten Suel. 2001
7. Distributed design tools: Mapping targeted design tools onto a Web-based distributed architecture for high-performance computing [R]. Holmes, V. P., Linebarger, J. M., Miller, D. J. 1999
