Journal: Expert Systems with Applications

A novel pipeline framework for multi oriented scene text image detection and recognition

Abstract

Automatic text detection and recognition (end-to-end text recognition) in real-life images is a core component of many applications, including assistance systems for blind and low-vision users and self-driving cars. However, curved and vertical text is challenging to detect because of color bleeding, font size variation, and complicated backgrounds. This paper introduces a convolutional neural network-based pipeline that obtains high-level visual features and improves text detection and recognition efficiency. A ResNet-50 network pre-trained on ImageNet and SynthText is used to extract low-level visual features. In the proposed structure, new improved ReLU layer (new.i.ReLU) blocks with varied receptive fields provide a strong ability to detect text components, even on curved surfaces. A new improved inception layer (new.i.inception layer) captures text of widely varying sizes more effectively than a linear chain of convolution layers. We also propose a pipeline framework for character recognition that is robust to irregular (curved and vertical) text. First, we introduce a novel algorithm, called local word directional pattern (LWDP), that encodes each pixel's value as a new value highlighting the texture of the characters. The LWDP output is then used as the input image for the text recognition process. Experiments on standard benchmarks, including the ICDAR 2013, ICDAR 2015, and ICDAR 2019 datasets, show that the proposed architecture outperforms prior works.
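
The abstract does not define LWDP, so it cannot be reproduced from this page. For orientation only, the sketch below shows the classic local directional pattern (LDP) encoding, a different and older texture descriptor that the name LWDP suggests it may build on (an assumption, not something the abstract states). Each interior pixel is replaced by an 8-bit code that marks the k strongest of its eight Kirsch edge responses. The function names `ldp_code` and `ldp_image`, the default `k=3`, and the zero-filled border are illustrative choices, not the authors' method.

```python
def ldp_code(patch, k=3):
    """Classic LDP code for a 3x3 patch (list of 3 lists of 3 numbers).

    Illustrative only: this is the standard LDP descriptor, NOT the
    paper's LWDP, whose definition is not given in the abstract.
    """
    # The 8 neighbours in circular order: E, NE, N, NW, W, SW, S, SE.
    ring = [patch[1][2], patch[0][2], patch[0][1], patch[0][0],
            patch[1][0], patch[2][0], patch[2][1], patch[2][2]]
    responses = []
    for d in range(8):
        # Kirsch mask for direction d: weight 5 on the three ring
        # positions centred on d, weight -3 on the other five.
        r = sum((5 if (i - d) % 8 in (7, 0, 1) else -3) * ring[i]
                for i in range(8))
        responses.append(abs(r))
    # Set one bit for each of the k strongest directions
    # (ties are broken by the lower direction index).
    top = sorted(range(8), key=lambda i: (-responses[i], i))[:k]
    return sum(1 << i for i in top)


def ldp_image(img, k=3):
    """Encode every interior pixel of a 2D list; border pixels stay 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            patch = [row[x - 1:x + 2] for row in img[y - 1:y + 2]]
            out[y][x] = ldp_code(patch, k)
    return out
```

Calling `ldp_image` on a grayscale image stored as a list of rows returns a code map of the same size. In a pipeline like the one described, a map of this kind would stand in for the raw image as the recognizer's input.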
