Home > Foreign Conferences > European Conference on Computer Vision > Scene Text Image Super-Resolution in the Wild

Scene Text Image Super-Resolution in the Wild

Abstract

Low-resolution text images are often seen in natural scenes, such as documents captured by mobile phones. Recognizing low-resolution text images is challenging because they lose detailed content information, leading to poor recognition accuracy. An intuitive solution is to introduce super-resolution (SR) techniques as pre-processing. However, previous single image super-resolution (SISR) methods are trained on synthetic low-resolution images (e.g. bicubic down-sampling), a degradation that is simple and not suitable for real low-resolution text recognition. To this end, we propose a real scene text SR dataset, termed TextZoom. It contains paired real low-resolution and high-resolution images captured in the wild by cameras with different focal lengths. It is more authentic and challenging than synthetic data, as shown in Fig. 1. We argue that improving recognition accuracy is the ultimate goal of scene text SR. For this purpose, we develop a new Text Super-Resolution Network, termed TSRN, with three novel modules. (1) A sequential residual block is proposed to extract the sequential information of the text images. (2) A boundary-aware loss is designed to sharpen the character boundaries. (3) A central alignment module is proposed to relieve the misalignment problem in TextZoom. Extensive experiments on TextZoom demonstrate that, compared to synthetic SR data, our TSRN improves recognition accuracy by over 13% for CRNN and by nearly 9.0% for ASTER and MORAN. Furthermore, our TSRN clearly outperforms 7 state-of-the-art SR methods in boosting the recognition accuracy of LR images in TextZoom. For example, it outperforms LapSRN by over 5% and 8% in recognition accuracy with ASTER and CRNN, respectively. Our results suggest that low-resolution text recognition in the wild is far from being solved, and that more research effort is needed.
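The abstract does not give the formula for the boundary-aware loss, so the following is only a minimal sketch of one plausible reading: penalize the difference between the image gradients of the super-resolved output and the high-resolution target, so that blurred character edges cost more than sharp ones. The function names `gradient_magnitude` and `boundary_aware_l1`, the forward-difference gradient, and the L1 penalty are all assumptions made for illustration, not the authors' definition.

```python
def gradient_magnitude(img):
    """Forward-difference gradient magnitude of a 2D grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Differences at the right and bottom borders are taken as zero.
            dx = img[y][x + 1] - img[y][x] if x + 1 < w else 0.0
            dy = img[y + 1][x] - img[y][x] if y + 1 < h else 0.0
            out[y][x] = (dx * dx + dy * dy) ** 0.5
    return out


def boundary_aware_l1(sr, hr):
    """Mean absolute difference between the gradient magnitudes of sr and hr.

    Hypothetical stand-in for a boundary-aware loss; the paper's exact
    formulation is not stated in the abstract.
    """
    g_sr, g_hr = gradient_magnitude(sr), gradient_magnitude(hr)
    h, w = len(hr), len(hr[0])
    total = sum(abs(g_sr[y][x] - g_hr[y][x]) for y in range(h) for x in range(w))
    return total / (h * w)


# Toy example: a sharp vertical edge (hr) versus a blurred version of it (sr).
hr = [[0.0, 0.0, 1.0, 1.0]] * 3
sr = [[0.0, 0.25, 0.75, 1.0]] * 3
loss_blurred = boundary_aware_l1(sr, hr)    # positive: edge is smeared
loss_identical = boundary_aware_l1(hr, hr)  # zero: gradients match
```

In this toy case the blurred edge is penalized while an exact reconstruction is not, which is the behavior a boundary-sharpening loss term would need.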
