Query by string word spotting based on character bi-gram indexing

Abstract

In this paper we propose a segmentation-free query-by-string word spotting method. Both the documents and the query strings are encoded using a recently proposed word representation that projects images and strings into a common attribute space based on a Pyramidal Histogram of Characters (PHOC). These attribute models are learned using linear SVMs over the Fisher Vector [8] representation of the images, along with the PHOC labels of the corresponding strings. To search through the whole page, document regions are indexed per character bi-gram using a similar attribute representation. In addition, we propose an integral image representation of the document that uses a simplified version of the attribute model for efficient computation. Finally, we introduce a re-ranking step to boost retrieval performance. We show state-of-the-art results for segmentation-free query-by-string word spotting on standard single-writer and multi-writer datasets.
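
As a rough illustration of the string side of the representation described in the abstract, the sketch below builds a PHOC-style binary vector for a query string and lists its character bi-grams. It follows the commonly described PHOC construction, in which a character is assigned to a pyramid region when at least half of the character's span falls inside that region. The abstract does not state the alphabet, the pyramid levels, or any other configuration, so `ALPHABET`, the `levels` default, and the function names `phoc` and `char_bigrams` are assumptions made only for this example, not the authors' settings.

```python
import string

# Assumed alphabet: lowercase letters and digits (not specified in the abstract).
ALPHABET = string.ascii_lowercase + string.digits


def phoc(word, levels=(1, 2, 3), alphabet=ALPHABET):
    """Return a PHOC-style binary vector for `word`.

    For each pyramid level, the word is split into `level` equal regions.
    A character is assigned to a region when at least 50% of the
    character's span overlaps that region. Integer arithmetic (spans
    scaled by n * level) avoids floating-point ties.
    """
    word = word.lower()
    n = len(word)
    vec = []
    for level in levels:
        for region in range(level):
            r_start, r_end = region * n, (region + 1) * n
            bits = [0] * len(alphabet)
            for k, ch in enumerate(word):
                if ch not in alphabet:
                    continue  # characters outside the assumed alphabet are skipped
                c_start, c_end = k * level, (k + 1) * level
                overlap = min(c_end, r_end) - max(c_start, r_start)
                # Character span has length `level` in the scaled units.
                if 2 * overlap >= level:
                    bits[alphabet.index(ch)] = 1
            vec.extend(bits)
    return vec


def char_bigrams(word):
    """Return the consecutive character bi-grams of `word`, in order."""
    word = word.lower()
    return [word[i:i + 2] for i in range(len(word) - 1)]


if __name__ == "__main__":
    query = "spotting"
    print(len(phoc(query)), "PHOC dimensions")
    print(char_bigrams(query))
```

In a pipeline like the one the abstract outlines, a vector of this kind would represent the query string, while the bi-grams would serve as keys for looking up indexed document regions; how regions are scored and re-ranked is not covered by this sketch.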

Bibliographic record

Source
International Conference on Document Analysis and Recognition | 2015 | 5 pages
Authors
Ghosh Suman K.; Valveny Ernest
Original format: PDF
Chinese Library Classification: Computing technology, computer technology
Keywords
document image processing; image representation; image retrieval; image segmentation; indexing; support vector machines; PHOC label; character bi-gram indexing; document retrieval; Fisher vector representation; integral image representation; linear SVM; multiwriter standard dataset; pyramidal histogram-of-character; segmentation-free query; string word spotting; word representation

Similar literature

1. INDEXING AND QUERYING CHARACTER SETS IN ONE- AND TWO-DIMENSIONAL WORDS [J]. D. Belazzougui, R. Kolpakov, M. Raffinot. Journal of Mathematical Sciences. 2018, No. 1
2. Querying out-of-vocabulary words in lexicon-based keyword spotting [J]. Puigcerver Joan, Toselli Alejandro H., Vidal Enrique. Neural Computing & Applications. 2017, No. 9
3. Word-Location Based Indexing Using Sequential Pattern in Deep Web Mining for Query Processing [J]. Meimoon Ibrahim, H. Mursalim Umar Gani, H. Bahar Sinring. Australian Journal of Basic and Applied Sciences. 2015, No. 2015
4. Query by string word spotting based on character bi-gram indexing [C]. Ghosh Suman K., Valveny Ernest. International Conference on Document Analysis and Recognition. 2015
5. M-Grid: A distributed framework for multidimensional indexing and querying of location based Big Data [D]. Kumar, Shashank. 2014
6. SAM: String-based sequence search algorithm for mitochondrial DNA database queries [O]. Alexander Röck, Jodi Irwin, Arne Dür
7. Query by String word spotting based on character bi-gram indexing [O]. Ghosh, Suman K., Valveny, Ernest. 2015
