A Multi-task Network for Localization and Recognition of Text in Images

Abstract

We present an end-to-end trainable multi-task network that addresses the problem of lexicon-free text extraction from complex documents. The network solves text localization and text recognition simultaneously, and text segments are identified with no post-processing, cropping, or word grouping. A convolutional backbone and a Feature Pyramid Network are combined to provide a shared representation that benefits each of three model heads: text localization, classification, and text recognition. To improve recognition accuracy, we describe a dynamic pooling mechanism that retains high-resolution information across all RoIs. For text recognition, we propose a convolutional mechanism with attention, which outperforms more common recurrent architectures. Our model is evaluated against benchmark datasets and comparable methods and achieves high performance in challenging regimes of non-traditional OCR.
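
The abstract does not say how the dynamic pooling mechanism works. As a rough illustration only, the sketch below assumes one plausible reading: each RoI is pooled to a fixed height, while the output width scales with the RoI's aspect ratio (up to a cap), so long text lines are not squeezed into a small fixed grid. The function names, the `out_h` and `max_w` parameters, and the use of average pooling are all hypothetical and are not taken from the paper.

```python
from typing import List

Grid = List[List[float]]  # a single-channel feature map crop (rows x columns)


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def dynamic_output_width(roi_h: int, roi_w: int, out_h: int, max_w: int) -> int:
    """Assumed rule: keep the RoI aspect ratio, clamped to [1, max_w]."""
    return max(1, min(max_w, round(out_h * roi_w / roi_h)))


def average_pool(roi: Grid, out_h: int, out_w: int) -> Grid:
    """Adaptive average pooling of a 2-D grid into out_h x out_w bins."""
    in_h, in_w = len(roi), len(roi[0])
    pooled = []
    for i in range(out_h):
        r0, r1 = (i * in_h) // out_h, ceil_div((i + 1) * in_h, out_h)
        row = []
        for j in range(out_w):
            c0, c1 = (j * in_w) // out_w, ceil_div((j + 1) * in_w, out_w)
            cells = [roi[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def dynamic_roi_pool(roi: Grid, out_h: int = 2, max_w: int = 8) -> Grid:
    """Pool an RoI to a fixed height and an aspect-dependent width (hypothetical)."""
    out_w = dynamic_output_width(len(roi), len(roi[0]), out_h, max_w)
    return average_pool(roi, out_h, out_w)


# A wide, text-line-shaped RoI keeps more columns than a square one.
wide_roi = [[float(c) for c in range(12)] for _ in range(2)]
square_roi = [[float(4 * r + c + 1) for c in range(4)] for r in range(4)]
wide_pooled = dynamic_roi_pool(wide_roi)
square_pooled = dynamic_roi_pool(square_roi)
```

With a fixed-size pooling grid, both RoIs above would end up with the same number of columns; under the assumed rule, the wide RoI retains more horizontal resolution, which is the kind of benefit the abstract attributes to its dynamic pooling.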

Bibliographic details

Source
International Conference on Document Analysis and Recognition | 2019 | pp. 494-501 | 8 pages
Authors
Mohammad Reza Sarshogh; Keegan Hines
Original format: PDF
Keywords
Text recognition; Optical character recognition software; Feature extraction; Object detection; Head; Task analysis; Proposals

Indexed: 2022-08-26 14:34:50

