Conference: International Conference on Advances in Computing, Communication and Automation

Design of focused crawler for information retrieval of Indian origin Academicians

Abstract

Search engines make finding information on the Internet much easier. A Web crawler, the core component of any search engine, follows URLs to gather information from the Web. For topic-specific crawling, a special type of crawler called a focused Web crawler is used. A focused crawler tries to find high-quality information on a specific topic while avoiding irrelevant links. In the contemporary world, national boundaries have faded for researchers, scientists, and academicians. In this paper, we apply the concept of focused crawling to information retrieval about academicians of Indian origin working abroad. The aim is to develop a database of such academicians working in universities abroad, so that they can be found and contacted. Gathering all such individuals through manual search is impractical, and hence this paper presents a design of a focused crawler that can aggregate all such information. This continuously updated database will cater to students wishing to connect with their alumni or other professors for academic collaboration.
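
The abstract does not spell out the crawler's design, so the following is only a minimal sketch of the general focused-crawling idea it describes: follow links in order of estimated relevance and skip links that look off-topic. Everything here is an assumption made for illustration, not the paper's method: the keyword-overlap score, the `relevance` and `focused_crawl` names, the threshold, and the in-memory `PAGES` graph that stands in for real HTTP fetching.

```python
import heapq

def relevance(text, keywords):
    """Toy score: fraction of topic keywords that occur as words in the text."""
    words = set(text.lower().split())
    return sum(1 for k in keywords if k in words) / len(keywords)

def focused_crawl(pages, seed, keywords, threshold=0.5, max_pages=10):
    """Best-first crawl over an in-memory link graph (hypothetical stand-in for the Web).

    pages maps URL -> (page text, list of (anchor text, target URL)).
    Returns, in visit order, the URLs whose page text meets the threshold.
    """
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so priorities are negated
    seen = {seed}
    relevant = []
    visited = 0
    while frontier and visited < max_pages:
        _, url = heapq.heappop(frontier)
        if url not in pages:
            continue
        visited += 1
        text, links = pages[url]
        if relevance(text, keywords) >= threshold:
            relevant.append(url)
        for anchor, target in links:
            if target in seen:
                continue
            score = relevance(anchor, keywords)
            if score > 0:  # links whose anchor text looks off-topic are not followed
                seen.add(target)
                heapq.heappush(frontier, (-score, target))
    return relevant

# Hypothetical toy data; the URLs and texts are invented for this example.
PAGES = {
    "http://example.org/": ("university home", [
        ("faculty directory", "http://example.org/faculty"),
        ("campus sports news", "http://example.org/sports"),
    ]),
    "http://example.org/faculty": ("faculty directory with professor listings", [
        ("professor profile", "http://example.org/faculty/a"),
        ("parking information", "http://example.org/parking"),
    ]),
    "http://example.org/faculty/a": ("professor of computer science faculty profile", []),
    "http://example.org/sports": ("sports results", []),
    "http://example.org/parking": ("parking permits", []),
}
KEYWORDS = ["faculty", "professor", "profile"]
```

Calling `focused_crawl(PAGES, "http://example.org/", KEYWORDS)` visits the home page, the faculty directory, and the profile page, and never fetches the sports or parking pages because their anchor text shares no keyword with the topic. A real crawler would replace the dictionary lookup with HTTP requests and HTML parsing, and would likely use a stronger relevance model than keyword overlap.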