Data-Parallel Web Crawling Models

Abstract

The need to quickly locate, gather, and store the vast amount of material on the Web necessitates parallel computing. In this paper, we propose two models, based on multi-constraint graph partitioning, for efficient data-parallel Web crawling. The models aim to balance the amount of data downloaded and stored by each processor as well as the number of page requests made by the processors. The models also minimize the total volume of communication during the link exchange between the processors. To evaluate the performance of the models, experimental results are presented on a sample Web repository containing around 915,000 pages.
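To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of multi-constraint partitioning for crawling: each page carries two weights, its size in bytes (download/storage load) and one page request, and hyperlinks crossing processor boundaries stand in for link-exchange communication. The greedy heuristic, the scoring function, and the neighbour-affinity weight are all illustrative assumptions; the paper itself relies on dedicated graph-partitioning tools.

```python
def partition(pages, links, k):
    """Greedily assign each page to one of k processors, balancing two
    constraints (bytes and request counts) while preferring processors
    that already hold the page's neighbours (fewer cut links means less
    link-exchange communication).

    pages: dict page_id -> size_bytes
    links: list of (src, dst) hyperlink pairs
    k:     number of processors
    """
    # Undirected adjacency for counting would-be cut links.
    adj = {p: set() for p in pages}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)

    size_load = [0] * k   # constraint 1: bytes stored per processor
    req_load = [0] * k    # constraint 2: page requests per processor
    assign = {}

    # Place heavier pages first so large items are spread out early.
    for p in sorted(pages, key=pages.get, reverse=True):
        best, best_score = None, None
        for proc in range(k):
            # Number of already-placed neighbours on this processor.
            gain = sum(1 for q in adj[p] if assign.get(q) == proc)
            # Lower score is better: combined load minus an arbitrary
            # affinity bonus (the weight 1000 is a tuning knob).
            score = size_load[proc] + pages[p] + req_load[proc] - 1000 * gain
            if best_score is None or score < best_score:
                best, best_score = proc, score
        assign[p] = best
        size_load[best] += pages[p]
        req_load[best] += 1
    return assign

def cut_links(links, assign):
    """Hyperlinks crossing processor boundaries (communication proxy)."""
    return sum(1 for a, b in links if assign[a] != assign[b])
```

On a toy graph of two linked page pairs, the heuristic keeps each pair on one processor, so no links are cut; real partitioners (and the models in the paper) balance both constraints under explicit tolerance bounds rather than a single ad hoc score.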
