Journal: International Journal of Computer Science and Network Security

Intelligent Crawling On Open Web for Business Prospects



Abstract

The dynamic nature of web-based systems requires continuous updating. Information retrieval depends on crawlers that crawl the web exhaustively, but businesses expect their crawlers to retrieve only the specific information their applications need. Crawlers download the required information by following the hyperlinks that occur in web pages, but the information retrieved is usually partial and fails to meet users' expectations. Retrieving updated information from a single link/URL is simple, but when many URLs carry the same information it becomes difficult to determine which URL provides the desired, sufficient, and up-to-date information. It is also difficult to remove duplicate stories from within the same link domain. This paper discusses the issues related to intelligent crawling and proposes various techniques to support web mining for business prospects.
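The abstract does not say which techniques the paper proposes, so the following is only an illustrative sketch of the duplicate-story problem it raises, not the authors' method. It assumes near-duplicates can be detected by comparing word shingles with Jaccard similarity, and that each page carries a last-updated timestamp so the freshest copy can be kept. The names `Story`, `shingles`, `jaccard`, `deduplicate`, and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from urllib.parse import urlparse


@dataclass
class Story:
    url: str
    text: str
    updated: datetime  # assumed "last updated" timestamp for the page


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles of a lowercased text."""
    words = text.lower().split()
    if len(words) < k:
        # Text shorter than k words: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(stories, threshold: float = 0.8):
    """Keep one story per near-duplicate group within each domain.

    Stories are visited newest first, so the survivor of each group
    is the most recently updated copy.
    """
    kept = []  # entries are (domain, shingle set, story)
    for story in sorted(stories, key=lambda s: s.updated, reverse=True):
        domain = urlparse(story.url).netloc
        sig = shingles(story.text)
        is_dup = any(
            d == domain and jaccard(sig, other_sig) >= threshold
            for d, other_sig, _ in kept
        )
        if not is_dup:
            kept.append((domain, sig, story))
    return [story for _, _, story in kept]
```

In this sketch, two copies of a story are compared only when they share a domain, which matches the "same link domain" case in the abstract. Comparing every pair is quadratic in the number of stories, so a crawler working at scale would likely need an indexing scheme on top of this.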
