
Directed web crawler with machine learning


Abstract

A web crawler receives an expression of a topic of general interest (such as cryptography), identifies and characterizes it, and generates an affinity set: a set of words related to that expression. Seed documents are then found using a common search engine. The seed documents, together with the affinity set and other search data, are used to train a classifier, whose output lets the web crawler search the web based on multiple criteria, including a content-based rating provided by the trained classifier. The web crawler can therefore perform its search in a topic-focused rather than “link”-focused manner. The relevant content found is ranked, and the results are displayed or saved for a specialty search.
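
The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not specify the classifier, so a simple smoothed term-frequency scorer stands in for it, and only the content-based rating is used out of the "multiple criteria". The names `train_classifier`, `score`, `crawl`, the `threshold` value, and the in-memory `WEB` dictionary (which replaces real HTTP fetching) are all hypothetical.

```python
import heapq
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_classifier(seed_docs, affinity_set):
    """Assumed stand-in classifier: smoothed frequency of affinity words in the seeds."""
    counts = Counter()
    for doc in seed_docs:
        counts.update(t for t in tokenize(doc) if t in affinity_set)
    total = sum(counts.values())
    # Add-one smoothing so affinity words missing from the seeds keep a small weight.
    return {w: (counts[w] + 1) / (total + len(affinity_set)) for w in affinity_set}


def score(text, weights):
    """Content-based rating: average affinity weight per token of the page."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(weights.get(t, 0.0) for t in tokens) / len(tokens)


def crawl(web, start_urls, weights, threshold=0.02, max_pages=10):
    """Topic-focused crawl over `web`, a dict of url -> (text, links)."""
    # heapq is a min-heap, so priorities are negated to pop the best-rated link first.
    frontier = [(-1.0, url) for url in start_urls]
    heapq.heapify(frontier)
    seen = set(start_urls)
    results = []
    fetched = 0
    while frontier and fetched < max_pages:
        _, url = heapq.heappop(frontier)
        if url not in web:
            continue
        text, links = web[url]
        fetched += 1
        rating = score(text, weights)
        if rating < threshold:
            continue  # off-topic page: neither kept nor expanded
        results.append((rating, url))
        for link in links:
            if link not in seen:
                seen.add(link)
                # A link inherits the rating of the page it was found on.
                heapq.heappush(frontier, (-rating, link))
    return sorted(results, reverse=True)  # ranked, best rating first


# Toy data (hypothetical): an affinity set, seed documents, and a four-page "web".
AFFINITY = {"cipher", "encryption", "key", "cryptography", "hash"}
SEEDS = [
    "Cryptography uses a cipher and a key for encryption.",
    "A hash and a key protect data; encryption needs a cipher.",
]
WEB = {
    "a": ("Encryption with a block cipher and a secret key", ["b", "c"]),
    "b": ("Recipes for baking bread at home", ["d"]),
    "c": ("Hash functions in cryptography", ["d"]),
    "d": ("Public key encryption and cipher design", []),
}

weights = train_classifier(SEEDS, AFFINITY)
ranked = crawl(WEB, ["a"], weights)
for rating, url in ranked:
    print(f"{url}: {rating:.3f}")
```

In this toy run the off-topic page `b` is fetched, rated below the threshold, and dropped without its links being followed, while page `d` is still reached through the on-topic page `c`. That is the sense in which the sketch is topic-focused rather than link-focused: which links get expanded depends on page content, not on link structure alone.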
