WEB CRAWLING INITIAL POINT SELECTION SYSTEM, METHOD, AND PROGRAM

Abstract

A graph construction means calculates a weight for each piece of web data according to how closely that web data matches information associated with a designated category, and constructs a weighted directed graph that includes the weights of the web data and the directed links between them. An initial point selection means refers to the weighted directed graph and selects, as the initial point, the web data with the highest score under a rule in which the higher the weight associated with a piece of web data and another piece of web data linked to it, the higher the score calculated for the latter. A crawling depth determination means refers to the weighted directed graph and determines the depth from the initial point to which web data is crawled, under a rule in which the score is calculated lower as the number of pieces of web data at a given depth from the initial point increases.
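
The three means described above can be illustrated with a minimal Python sketch. The abstract does not give concrete formulas, so everything below is an assumption made for illustration: the weight is taken to be the fraction of category keywords found in a page, the initial-point score is taken to be a page's own weight plus the weights of the pages it links to (one possible reading of the scoring rule), and the depth score is taken to be the mean weight of the pages at a depth, which falls as more pages appear at that depth. The names `page_weight`, `build_graph`, `select_initial_point`, `crawl_depth`, and `min_score`, as well as the sample pages, are hypothetical.

```python
def page_weight(text, keywords):
    """Assumed metric: fraction of category keywords that occur in the page text."""
    words = set(text.lower().split())
    return sum(1 for k in keywords if k.lower() in words) / len(keywords)


def build_graph(pages, links, keywords):
    """Build a weighted directed graph.

    pages: {url: text}, links: {url: [linked urls]}.
    Returns (weights, edges) where edges only point at known pages.
    """
    weights = {url: page_weight(text, keywords) for url, text in pages.items()}
    edges = {url: [t for t in links.get(url, []) if t in pages] for url in pages}
    return weights, edges


def select_initial_point(weights, edges):
    """Assumed rule: score = own weight + weights of directly linked pages."""
    def score(url):
        return weights[url] + sum(weights[t] for t in edges[url])
    return max(weights, key=score)


def crawl_depth(start, weights, edges, min_score=0.5):
    """Return the deepest level whose mean page weight is still >= min_score.

    The per-depth score (total weight / page count) drops as the number of
    pages at that depth grows, which is the assumed form of the depth rule.
    """
    seen = {start}
    level = [start]
    depth = 0
    while True:
        next_level = []
        for url in level:
            for target in edges[url]:
                if target not in seen:
                    seen.add(target)
                    next_level.append(target)
        if not next_level:
            return depth
        level_score = sum(weights[t] for t in next_level) / len(next_level)
        if level_score < min_score:
            return depth
        level = next_level
        depth += 1


# Hypothetical sample data for the category described by two keywords.
pages = {
    "a": "python web crawling tutorial",
    "b": "crawling the web with python",
    "c": "cooking recipes",
    "d": "python crawling tips",
    "e": "gardening",
    "f": "travel blog",
}
links = {"a": ["b", "d"], "b": ["d"], "c": ["a"], "d": ["e", "f"]}
weights, edges = build_graph(pages, links, ["python", "crawling"])
start = select_initial_point(weights, edges)
depth = crawl_depth(start, weights, edges)
```

With this sample data, page `a` is selected because it is relevant itself and links to two relevant pages, and crawling stops after depth 1 because the pages at depth 2 do not match the category. A real system would likely use a richer relevance measure and a propagated score; the sketch only shows how the three steps fit together.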
