Systems and methods for clustering of near-duplicate images in very large image collections

Abstract

Detection of near-duplicate images is important for detecting the reuse of copyrighted material. Some applications require clustering near-duplicates rather than comparing each image to an original. Representing images as bags of visual words is the first step of our clustering approach. An inverted index points from each visual word to all the images containing that visual word. In the next step, matches are geometrically verified in pairs of images that share a large fraction of their visual words. Geometric verification may use affine, perspective, or other transformations. The verification step provides a similarity measure based on the fraction of matching image points and on their distributions in the compared images. The resulting distance matrix is very sparse because most images in the collection are not compared to each other. This distance matrix is used as input for a modified agglomerative hierarchical clustering approach that can handle a sparse distance matrix.
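
The candidate-selection stage described in the abstract (an inverted index from visual words to images, followed by picking pairs that share a large fraction of their words) can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the function names, the `min_shared_fraction` threshold, and the choice to normalize by the smaller bag are all assumptions, and visual words are taken to be plain integer ids.

```python
from collections import defaultdict
from itertools import combinations


def build_inverted_index(bags):
    """Map each visual word to the set of image ids that contain it."""
    index = defaultdict(set)
    for image_id, words in bags.items():
        for word in set(words):
            index[word].add(image_id)
    return index


def candidate_pairs(bags, min_shared_fraction=0.5):
    """Return {(a, b): fraction} for image pairs sharing many visual words.

    The fraction is measured against the smaller of the two bags; this
    normalization is an assumption made for the sketch.
    """
    index = build_inverted_index(bags)

    # Count shared words only for images that co-occur under some word,
    # so pairs with nothing in common are never touched.
    shared = defaultdict(int)
    for images in index.values():
        for a, b in combinations(sorted(images), 2):
            shared[(a, b)] += 1

    pairs = {}
    for (a, b), count in shared.items():
        smaller = min(len(set(bags[a])), len(set(bags[b])))
        fraction = count / smaller
        if fraction >= min_shared_fraction:
            pairs[(a, b)] = fraction
    return pairs


if __name__ == "__main__":
    bags = {"a": [1, 2, 3, 4], "b": [1, 2, 3, 9], "c": [7, 8]}
    print(candidate_pairs(bags))
```

Only the pairs returned here would go on to geometric verification, which is why most entries of the distance matrix are never computed.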
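
To illustrate how agglomerative clustering can run on a sparse distance matrix, the sketch below treats every missing entry as "infinitely far" and merges clusters along the stored distances in increasing order, stopping at a cutoff. The abstract does not say which linkage or stopping rule the modified algorithm uses, so single linkage, the `threshold` parameter, and the function name `cluster_sparse` are assumptions chosen for brevity.

```python
from collections import defaultdict


def cluster_sparse(image_ids, distances, threshold=0.5):
    """Single-linkage agglomerative clustering over a sparse distance dict.

    distances maps (a, b) pairs to a distance; pairs that were never
    compared are simply absent and are never merged directly.
    """
    parent = {i: i for i in image_ids}

    def find(x):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the closest pairs first, as agglomerative clustering does.
    for (a, b), d in sorted(distances.items(), key=lambda kv: kv[1]):
        if d > threshold:
            break
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for i in image_ids:
        clusters[find(i)].append(i)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    ids = ["a", "b", "c", "d"]
    dist = {("a", "b"): 0.1, ("b", "c"): 0.2, ("c", "d"): 0.9}
    print(cluster_sparse(ids, dist, threshold=0.5))
```

With single linkage, images "a" and "c" end up together through "b" even though they were never compared directly; a stricter linkage would need an explicit policy for such missing distances.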

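Bibliographic data

  • Publication number: US10504002B2

  • Publication date: 2019-12-10

  • Original format: PDF

  • Applicant/Assignee: FUJI XEROX CO. LTD.

  • Application number: US201715663815

  • Inventor: ANDREAS GIRGENSOHN

  • Filing date: 2017-07-30

  • Classification: G06K9/62; G06K9/46; G06F16/55

  • Country: US

  • Database entry time: 2022-08-21 11:23:50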
