International Conference on Document Analysis and Recognition
Query by string word spotting based on character bi-gram indexing

Abstract

In this paper we propose a segmentation-free query-by-string word spotting method. Both the documents and the query strings are encoded using a recently proposed word representation that projects images and strings into a common attribute space based on a Pyramidal Histogram of Characters (PHOC). These attribute models are learned using linear SVMs over the Fisher Vector [8] representation of the images, together with the PHOC labels of the corresponding strings. To search through the whole page, document regions are indexed per character bi-gram using a similar attribute representation. On top of that, we propose an integral image representation of the document, built from a simplified version of the attribute model, for efficient computation. Finally, we introduce a re-ranking step to boost retrieval performance. We show state-of-the-art results for segmentation-free query-by-string word spotting on standard single-writer and multi-writer datasets.
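To make the string side of the PHOC representation concrete, here is a minimal sketch of how a query string might be turned into a binary attribute vector. It follows a common formulation of PHOC, in which a character is assigned to a pyramid region when at least half of the character's span falls inside that region. The alphabet, the pyramid levels, and the function name `phoc` are illustrative assumptions, not values taken from the paper, and the bi-gram attributes used for indexing are left out.

```python
import string

# Assumed alphabet: lowercase letters plus digits (not specified in the abstract).
ALPHABET = string.ascii_lowercase + string.digits


def phoc(word, levels=(1, 2, 3), alphabet=ALPHABET):
    """Return a binary PHOC-style vector for `word` (hypothetical helper).

    For each pyramid level the word is split into `level` equal regions, and
    each region gets one bit per alphabet symbol. A bit is set when at least
    50% of a character's span overlaps the region.
    """
    word = word.lower()
    n = len(word)
    index = {c: i for i, c in enumerate(alphabet)}
    vec = []
    for level in levels:
        for region in range(level):
            bits = [0] * len(alphabet)
            # Work in integer units of 1 / (n * level) to avoid float rounding:
            # character k spans [k * level, (k + 1) * level],
            # region r spans [r * n, (r + 1) * n].
            r_start, r_end = region * n, (region + 1) * n
            for k, ch in enumerate(word):
                if ch not in index:
                    continue  # symbols outside the assumed alphabet are ignored
                c_start, c_end = k * level, (k + 1) * level
                overlap = min(c_end, r_end) - max(c_start, r_start)
                # Character length is `level` units, so this tests overlap >= 50%.
                if 2 * overlap >= level:
                    bits[index[ch]] = 1
            vec.extend(bits)
    return vec


query = phoc("beyond", levels=(1, 2))
```

In a setup like the one the abstract describes, a vector of this kind would be compared against attribute scores predicted from image regions, so that strings and images can be matched in the same space. How that comparison is scored is not stated here and would be a further assumption.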