Conference: International Conference on Advanced Electronic Materials, Computers and Materials Engineering

Research and Design of Theme Image Crawler Based on Difference Hash Algorithm



Abstract

To address the high duplication rate among images collected by general-purpose theme crawlers, a theme image crawler system is designed to reduce the similarity among collected images. The design covers the crawler's main functional modules, the system workflow, and the implementation of the key modules. A difference hash algorithm is used to detect similar images. The relevance between Web resources and the theme is evaluated by combining a cosine similarity measure over Web text with the PageRank algorithm over links. The experimental results show that the theme image crawler reduces the similarity among collected images and improves the efficiency of image resource acquisition.
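
The abstract does not give the paper's implementation, so the following is only a minimal sketch of how a difference hash is commonly computed and compared. It assumes the image has already been decoded to rows of grayscale values; the function names, the nearest-neighbour shrink step, the 64-bit hash size, and the Hamming-distance threshold of 5 are all illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence

# Assumption: an image is given as rows of grayscale values in 0..255.
Gray = Sequence[Sequence[int]]


def shrink(pixels: Gray, width: int, height: int) -> List[List[int]]:
    """Nearest-neighbour downsample (illustrative; an image library
    would normally do a smoother resize)."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def dhash(pixels: Gray, hash_size: int = 8) -> int:
    """Difference hash: one bit per pair of horizontally adjacent pixels."""
    # A (hash_size + 1) x hash_size grid yields hash_size * hash_size bits.
    small = shrink(pixels, hash_size + 1, hash_size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            # Bit is 1 when brightness drops from left to right.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, threshold: int = 5) -> bool:
    """Treat two images as similar if their hashes differ in few bits.
    The threshold is a hypothetical tuning parameter."""
    return hamming(dhash(a), dhash(b)) <= threshold
```

Because each bit records only whether brightness rises or falls between neighbouring pixels, a uniform brightness shift leaves the hash unchanged, which is one reason this kind of hash suits duplicate filtering in a crawler. A crawler could store the hash of every accepted image and skip any new image whose hash falls within the threshold of a stored one.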
