
Focused Web Crawler

Abstract

With the rapid growth of data on the World Wide Web and the equally rapid growth in the number of web users worldwide, there is an acute need to improve, modify, or design search algorithms that can effectively and efficiently retrieve the specific data a user requires from this huge repository. Search engines use different web crawlers to obtain search results efficiently. Some use a focused web crawler, which collects web pages satisfying some specific property by prioritizing the crawl frontier and managing the hyperlink exploration process. A focused web crawler analyzes its crawl boundary to locate the links most likely to be relevant to the crawl and avoids irrelevant regions of the web. This yields significant savings in hardware and network resources and helps keep the crawl up to date. The goal of a focused web crawler is to nurture a collection of web documents centered on certain topical subspaces. It identifies the next most relevant link to follow by relying on probabilistic models that predict the relevance of each document. Researchers have proposed various algorithms for improving the efficiency of focused web crawlers. We investigate the various types of crawlers along with their pros and cons, with the focused web crawler as our major focus, and discuss future directions for improving its efficiency. This survey is intended as a base reference for anyone who wishes to research or apply the concept of a focused web crawler in their own work. The performance of a focused web crawler depends on the richness of links within the specific topic being searched, and it usually relies on a general web search engine to provide starting points for the crawl.
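The abstract does not specify a concrete algorithm, but the best-first, frontier-prioritizing behaviour it describes can be sketched in a few lines of Python using only the standard library. The sketch below is illustrative only: the relevance function is a toy term-overlap score standing in for the probabilistic relevance models the abstract mentions, and the seed URLs, topic terms, and threshold are all hypothetical.

import heapq
import re
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collects href values from anchor tags on a fetched page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def relevance(text, topic_terms):
    # Toy relevance score: fraction of (lowercase) topic terms present
    # in the page text. A real focused crawler would use a trained
    # probabilistic classifier here, as the abstract describes.
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sum(t in words for t in topic_terms) / len(topic_terms)

def focused_crawl(seeds, topic_terms, max_pages=20, threshold=0.3):
    # Best-first crawl: the frontier is a max-heap (negated scores)
    # ordered by the relevance of the page each link was found on.
    frontier = [(-1.0, url) for url in seeds]  # seeds get top priority
    heapq.heapify(frontier)
    visited, results = set(), []
    while frontier and len(results) < max_pages:
        _, url = heapq.heappop(frontier)
        if url in visited:
            continue
        visited.add(url)
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", "replace")
        except Exception:
            continue
        score = relevance(html, topic_terms)
        if score < threshold:
            continue  # prune irrelevant regions of the web
        results.append((score, url))
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            child = urljoin(url, href)
            if child.startswith("http") and child not in visited:
                # Children inherit the parent's score as their priority.
                heapq.heappush(frontier, (-score, child))
    return results

# Hypothetical usage: in practice the seeds would come from a general
# web search engine, as the abstract notes.
# pages = focused_crawl(["https://example.com"], {"crawler", "search", "web"})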
