Ontology-Based Focused Crawler

Abstract

This paper studies how a focused crawler can collect topical pages effectively and accurately. We analyze the shortcomings of traditional methods and present a model for extracting a page's feature vector. We then present a second, ontology-based model for computing the semantic similarity between pages. Finally, we build a focused crawler that combines these two models with the Best-First strategy of [1] and the URL distribution strategy of [2]. The experimental results show that the two models are effective and accurate for feature vector extraction and similarity calculation. We therefore conclude that the ontology-based focused crawler presented here is a feasible approach to focused crawling.
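The abstract does not give the details of either model, so the following is only a minimal sketch of the general Best-First idea it refers to: a priority queue of URLs ordered by the topical score of the page that linked to them. Everything here is an assumption for illustration. Plain cosine similarity over bag-of-words vectors stands in for the paper's ontology-based similarity, and `term_vector`, `best_first_crawl`, the `fetch` callback, and `TOY_WEB` are hypothetical names, not the authors' implementation.

```python
import heapq
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term frequencies (a stand-in for the paper's feature vector)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_first_crawl(seed, topic_text, fetch, max_pages=10):
    """Visit pages in order of the score of the page that linked to them.

    `fetch(url)` is a hypothetical callback returning (page_text, outgoing_urls).
    """
    topic = term_vector(topic_text)
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so scores are negated
    seen = {seed}
    visited = []
    while frontier and len(visited) < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = fetch(url)
        score = cosine_similarity(term_vector(text), topic)
        visited.append((url, score))
        for link in links:
            if link not in seen:
                seen.add(link)
                # Unvisited links inherit the score of the page they were found on.
                heapq.heappush(frontier, (-score, link))
    return visited


# A toy in-memory "web" so the sketch runs without network access.
TOY_WEB = {
    "a": ("ontology crawler semantic web", ["b", "c"]),
    "b": ("cooking recipes pasta", ["d"]),
    "c": ("focused crawler ontology", ["e"]),
    "d": ("gardening tips", []),
    "e": ("ontology based focused crawler", []),
}

visited = best_first_crawl("a", "ontology focused crawler", TOY_WEB.__getitem__)
```

In this toy run, the link found on the on-topic page `c` is expanded before the link found on the off-topic page `b`, which is the behaviour that distinguishes Best-First from breadth-first crawling. An ontology-based crawler would replace `cosine_similarity` with a measure that also credits semantically related concepts rather than only identical terms.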
