Scene Text Image Super-Resolution in the Wild

Abstract

Low-resolution text images are common in natural scenes, for example documents captured by mobile phones. Recognizing low-resolution text images is challenging because they lose detailed content information, leading to poor recognition accuracy. An intuitive solution is to introduce super-resolution (SR) techniques as pre-processing. However, previous single image super-resolution (SISR) methods are trained on synthetic low-resolution images (e.g., bicubic down-sampling), a degradation that is too simple to suit real low-resolution text recognition. To this end, we propose a real scene text SR dataset, termed TextZoom. It contains paired real low-resolution and high-resolution images captured in the wild by cameras with different focal lengths. It is more authentic and challenging than synthetic data, as shown in Fig. 1. We argue that improving recognition accuracy is the ultimate goal of scene text SR. For this purpose, we develop a new Text Super-Resolution Network, termed TSRN, with three novel modules. (1) A sequential residual block is proposed to extract the sequential information of the text images. (2) A boundary-aware loss is designed to sharpen the character boundaries. (3) A central alignment module is proposed to relieve the misalignment problem in TextZoom. Extensive experiments on TextZoom demonstrate that TSRN improves the recognition accuracy of CRNN by over 13%, and of ASTER and MORAN by nearly 9.0%, compared to synthetic SR data. Furthermore, TSRN clearly outperforms 7 state-of-the-art SR methods in boosting the recognition accuracy of LR images in TextZoom. For example, it outperforms LapSRN by over 5% and 8% on the recognition accuracy of ASTER and CRNN, respectively. Our results suggest that low-resolution text recognition in the wild is far from solved, so more research effort is needed.
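
The abstract names a boundary-aware loss but does not define it. As a rough sketch only, one common way to encourage sharp character boundaries is to penalize the L1 gap between the image-gradient magnitudes of the super-resolved (SR) and high-resolution (HR) images. That formulation is an assumption here, and the function names `gradient_magnitude` and `boundary_aware_l1` are hypothetical; they are not taken from the TSRN code.

```python
# Illustrative sketch: the abstract does not specify the boundary-aware loss.
# Assumption: compare gradient magnitudes of the SR output and the HR target.


def gradient_magnitude(img):
    """Forward-difference gradient magnitude of a 2D grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    grad = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Differences are set to zero at the right and bottom borders.
            dx = img[y][x + 1] - img[y][x] if x + 1 < w else 0.0
            dy = img[y + 1][x] - img[y][x] if y + 1 < h else 0.0
            grad[y][x] = (dx * dx + dy * dy) ** 0.5
    return grad


def boundary_aware_l1(sr, hr):
    """Mean absolute difference between the gradient maps of sr and hr."""
    g_sr, g_hr = gradient_magnitude(sr), gradient_magnitude(hr)
    total = sum(
        abs(a - b)
        for row_sr, row_hr in zip(g_sr, g_hr)
        for a, b in zip(row_sr, row_hr)
    )
    return total / (len(hr) * len(hr[0]))


# Toy 4x4 images: a sharp vertical edge and a smeared version of the same edge.
hr = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
blurry = [[0.0, 0.25, 0.75, 1.0] for _ in range(4)]

loss_identical = boundary_aware_l1(hr, hr)
loss_blurry = boundary_aware_l1(blurry, hr)
```

In this toy case the loss is zero when the SR output matches the HR target and positive when the edge is smeared, which is the behavior a boundary-sharpening term would need. A real training setup would compute this on tensors and combine it with a pixel-wise loss.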

Bibliographic details

Source
European Conference on Computer Vision | 2020 | pp. 650-666 | 17 pages
Authors
Wenjia Wang; Enze Xie; Xuebo Liu; Wenhai Wang; Ding Liang; Chunhua Shen; Xiang Bai
Original format: PDF
Keywords
Scene text recognition; Super-resolution; Dataset; Sequence; Boundary;

Similar literature

1. A pooling based scene text proposal technique for scene text reading in the wild [J]. Dinh NguyenVan, Lu Shijian, Tian Shangxuan. Pattern Recognition: The Journal of the Pattern Recognition Society, 2019.
2. Cursive-Text: A Comprehensive Dataset for End-to-End Urdu Text Recognition in Natural Scene Images [J]. Asghar Ali Chandio, Md. Asikuzzaman, Mark Pickering. Data in Brief, 2020, no. 3.
3. Multi-script text versus non-text classification of regions in scene images [J]. Sriman Bowornrat, Schomaker Lambert. Journal of Visual Communication & Image Representation, 2019, no. JULa.
4. Selective Super-Resolution for Scene Text Images [C]. Ryo Nakao, Brian Kenji Iwana, Seiichi Uchida. International Conference on Document Analysis and Recognition, 2019.
5. Unified detection and recognition for reading text in scene images [D]. Weinman, Jerod J. 2008.
6. Cursive-Text: A Comprehensive Dataset for End-to-End Urdu Text Recognition in Natural Scene Images [O]. Asghar Ali Chandio, Md. Asikuzzaman, Mark Pickering. 2020.
7. Scene Text Image Super-Resolution in the Wild [O]. Wenjia Wang, Enze Xie, Xuebo Liu. 2020.
8. Method and Apparatus for Recognizing Text in an Image Sequence of Scene Imagery [R]. Myers, G. K., Bolles, R. C., Luong, Q. T. 2006.
