Design of focused crawler for information retrieval of Indian origin Academicians

Abstract

Search engines make it much easier to find information on the Internet. The Web crawler is the main part of any search engine: it follows URLs to gather information from the Web. For topic-specific crawling, a special type of crawler called a focused Web crawler is used. A focused crawler tries to find high-quality information on a specific topic while avoiding irrelevant links. In the contemporary world, national boundaries have faded for researchers, scientists, and academicians. In this paper, we apply the concept of focused crawling to retrieve information about academicians of Indian origin working abroad. The aim is to build a database of such academicians working in universities abroad, so that they can be found and contacted. Gathering all such individuals through manual search is an impossible task, and hence this paper presents the design of a focused crawler that can aggregate this information. The continuously updated database will serve students who wish to connect with their alumni or with other professors for academic collaboration.
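The abstract does not describe the crawler's internals, so the following is only a minimal sketch of the general best-first idea behind focused crawling, not the authors' design. The keyword set, the `relevance` scoring function, the `threshold` value, and the in-memory `TOY_WEB` are all assumptions made for illustration; a real crawler would fetch pages over HTTP and use a stronger relevance model.

```python
import heapq

# Hypothetical topic keywords; the paper's actual relevance features
# are not given in the abstract.
TOPIC_KEYWORDS = {"professor", "faculty", "university", "research"}


def relevance(text, keywords=TOPIC_KEYWORDS):
    """Fraction of topic keywords that appear in the text (0.0 to 1.0)."""
    words = set(text.lower().split())
    return len(words & keywords) / len(keywords)


def focused_crawl(seeds, fetch, threshold=0.25, max_pages=100):
    """Best-first crawl: expand the most promising URL first and do not
    follow links out of pages that score below `threshold`."""
    frontier = [(-1.0, url) for url in seeds]  # max-heap via negated priority
    heapq.heapify(frontier)
    seen = set(seeds)
    relevant = []
    visited = 0
    while frontier and visited < max_pages:
        _, url = heapq.heappop(frontier)
        page = fetch(url)
        if page is None:
            continue  # unknown or unreachable URL
        visited += 1
        text, links = page
        score = relevance(text)
        if score < threshold:
            continue  # irrelevant page: its outgoing links are not followed
        relevant.append((url, score))
        for link in links:
            if link not in seen:
                seen.add(link)
                # A child inherits its parent's score as its priority.
                heapq.heappush(frontier, (-score, link))
    return relevant


# Toy in-memory "web" (assumed data) so the sketch runs without network access.
TOY_WEB = {
    "seed": ("university faculty directory", ["prof", "sports"]),
    "prof": ("professor of computer science research university", ["lab"]),
    "sports": ("football scores and match results", ["tickets"]),
    "lab": ("research lab led by a professor", []),
    "tickets": ("buy tickets online", []),
}

results = focused_crawl(["seed"], TOY_WEB.get)
```

In this toy run the off-topic "sports" page is fetched and scored but not expanded, so "tickets" is never visited. That is the sense in which a focused crawler avoids irrelevant links.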

Bibliographic details

Source: International Conference on Advances in Computing, Communication and Automation, 2016, pp. 1-6 (6 pages)
Authors: Manish Kumar; Rajesh Bhatia; Apoorva Ohri; Aditya Kohli
Original format: PDF
Keywords: Crawlers; Databases; Data mining; Uniform resource locators; Internet; Search engines

