Bag of Local Convolutional Triplets for Script Identification in Scene Text



Abstract

The increasing interest in scene text reading in multilingual environments raises the need to recognize and distinguish between different writing systems. In this paper, we propose a novel method for script identification in scene text using triplets of local convolutional features in combination with the traditional bag-of-visual-words model. Feature triplets are created by making combinations of descriptors extracted from local patches of the input images using a convolutional neural network. This approach allows us to generate a more descriptive codeword dictionary for the bag-of-visual-words model, as the low discriminative power of weak descriptors is enhanced by other descriptors in a triplet. The proposed method is evaluated on two public benchmark datasets for scene text script identification and a public dataset for script identification in video captions. The experiments demonstrate that our method outperforms the baseline and yields competitive results on all three datasets.
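
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how descriptors are combined into a triplet, how the codebook is learned, or which network produces the descriptors. The sketch below assumes concatenation of every unordered combination of three descriptors, a plain k-means codebook, and random vectors standing in for convolutional features. All function names (`make_triplets`, `kmeans`, `assign`, `bow_histogram`) are hypothetical.

```python
import itertools
import numpy as np

def make_triplets(descriptors):
    """Concatenate every unordered combination of three local descriptors.
    Assumption: the abstract does not specify how triplets are formed."""
    idx = np.array(list(itertools.combinations(range(len(descriptors)), 3)))
    return descriptors[idx].reshape(len(idx), -1)

def assign(points, centers):
    """Return the index of the nearest center (squared Euclidean) per point."""
    dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, iters=10, seed=0):
    """Plain k-means, used here as an assumed way to build the codebook."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(points, centers)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # leave a center unchanged if its cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(triplets, codebook):
    """Normalized histogram of codeword assignments for one image."""
    labels = assign(triplets, codebook)
    hist = np.bincount(labels, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
# Stand-in for CNN descriptors: 4 images, 6 local patches each, 8 dimensions.
images = [rng.normal(size=(6, 8)) for _ in range(4)]
triplets_per_image = [make_triplets(d) for d in images]
codebook = kmeans(np.vstack(triplets_per_image), k=5)
histograms = np.stack([bow_histogram(t, codebook) for t in triplets_per_image])
print(histograms.shape)
```

Each row of `histograms` is a fixed-length image representation that could be passed to any standard classifier to predict the script. Enumerating all combinations grows cubically with the number of patches, so a practical system would likely need to sample or restrict the triplets.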


Bibliographic details

Source
IAPR International Conference on Document Analysis and Recognition, 2017, pp. 369-375 (7 pages)
Authors
Jan Zdenek; Hideki Nakayama
Original format
PDF
Keywords
Feature extraction; Convolutional codes; Text recognition; Dictionaries; Training; Task analysis


