Query by string word spotting based on character bi-gram indexing

Abstract

In this paper we propose a segmentation-free query-by-string word spotting method. Both the documents and the query strings are encoded using a recently proposed word representation that projects images and strings into a common attribute space based on a Pyramidal Histogram of Characters (PHOC). These attribute models are learned using linear SVMs over the Fisher Vector [8] representation of the images, along with the PHOC labels of the corresponding strings. To search through the whole page, document regions are indexed per character bi-gram using a similar attribute representation. On top of that, we propose an integral image representation of the document, built from a simplified version of the attribute model, for efficient computation. Finally, we introduce a re-ranking step to boost retrieval performance. We show state-of-the-art results for segmentation-free query-by-string word spotting on standard single-writer and multi-writer datasets.
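
To make the PHOC idea concrete, the following is a minimal sketch of how a query string could be turned into a binary PHOC vector. The abstract does not give the construction details, so the alphabet, the pyramid levels (2 to 5) and the 50% overlap rule used here are assumptions based on the commonly described PHOC formulation, and the function name `phoc` is illustrative only.

```python
import string

# Assumed alphabet and pyramid levels; the abstract does not specify them.
ALPHABET = string.ascii_lowercase + string.digits
LEVELS = (2, 3, 4, 5)


def phoc(word, alphabet=ALPHABET, levels=LEVELS):
    """Return a binary PHOC vector for `word` as a flat list of 0/1 values.

    At each level the word is split into `level` equal regions. A character
    is assigned to a region when at least half of the character's span falls
    inside that region (assumed rule). Integer arithmetic is used so that the
    "exactly half" case is not affected by floating-point rounding.
    """
    word = word.lower()
    n = len(word)
    vector = []
    for level in levels:
        for region in range(level):
            bits = [0] * len(alphabet)
            # In units of 1 / (n * level): the region spans
            # [region * n, (region + 1) * n] and character k spans
            # [k * level, (k + 1) * level].
            r_start, r_end = region * n, (region + 1) * n
            for k, ch in enumerate(word):
                idx = alphabet.find(ch)
                if idx < 0:
                    continue  # characters outside the alphabet are ignored
                c_start, c_end = k * level, (k + 1) * level
                overlap = min(c_end, r_end) - max(c_start, r_start)
                if 2 * overlap >= level:  # overlap is at least half the character
                    bits[idx] = 1
            vector.extend(bits)
    return vector
```

With these assumed settings, every string maps to a vector of the same length (36 symbols times 14 regions), so a query string can be compared with attribute scores predicted for an image region, for example by a dot product.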

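The efficiency gain from an integral image can be illustrated with a generic summed-area table. The abstract does not describe how the authors build theirs, so this is only a sketch of the general technique: `grid` stands for a hypothetical 2-D grid holding one attribute score per document cell, and `integral_image` and `box_sum` are illustrative names.

```python
def integral_image(grid):
    """Build a summed-area table with an extra leading row and column of zeros.

    sat[y][x] holds the sum of grid[0:y][0:x], so any rectangle sum can later
    be read with four lookups.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        row_sum = 0
        for x in range(cols):
            row_sum += grid[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of the cells in rows top..bottom-1 and columns left..right-1."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


# Hypothetical per-cell scores for a single attribute.
scores = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
sat = integral_image(scores)
window_score = box_sum(sat, 1, 1, 3, 3)  # bottom-right 2x2 window
```

Once the table is built, the aggregated score of any candidate window costs a constant number of operations regardless of window size, which is what makes scanning many regions of a full page affordable.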

Bibliographic details

Source
International Conference on Document Analysis and Recognition, 2015, pp. 881-885 (5 pages)
Authors
Ghosh Suman K.; Valveny Ernest
Original format: PDF
Keywords
document image processing; image representation; image retrieval; image segmentation; indexing; support vector machines; PHOC label; character bi-gram indexing; document retrieval; Fisher vector representation; integral image representation; linear SVM; multiwriter standard dataset; pyramidal histogram-of-character; segmentation-free query; string word spotting; word representation
