A Multi-task Network for Localization and Recognition of Text in Images

Abstract

We present an end-to-end trainable multi-task network that addresses the problem of lexicon-free text extraction from complex documents. The network solves text localization and text recognition simultaneously, and text segments are identified with no post-processing, cropping, or word grouping. A convolutional backbone and a Feature Pyramid Network are combined to provide a shared representation that benefits each of the three model heads: text localization, classification, and text recognition. To improve recognition accuracy, we describe a dynamic pooling mechanism that retains high-resolution information across all RoIs. For text recognition, we propose a convolutional mechanism with attention that outperforms more common recurrent architectures. Our model is evaluated against benchmark datasets and comparable methods and achieves high performance in challenging regimes of non-traditional OCR.
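
The abstract does not spell out how the dynamic pooling mechanism works. As a rough illustration only, the following Python sketch assumes one plausible reading: each RoI is pooled to a fixed height while the output width scales with the RoI's aspect ratio, so long text lines keep more horizontal detail than a fixed square grid would allow. The function names `dynamic_grid` and `max_pool_roi`, the parameters `out_h` and `max_w`, and the integer-aligned max pooling are all assumptions made for this example and are not taken from the paper.

```python
import math
from typing import List, Tuple


def dynamic_grid(roi_h: float, roi_w: float,
                 out_h: int = 4, max_w: int = 32) -> Tuple[int, int]:
    """Pick a pooling grid with fixed height and aspect-preserving width.

    Hypothetical helper: the width is capped at max_w so that very long
    RoIs still produce a bounded output.
    """
    if roi_h <= 0 or roi_w <= 0:
        raise ValueError("RoI must have positive size")
    out_w = math.ceil(out_h * roi_w / roi_h)
    return out_h, max(1, min(out_w, max_w))


def max_pool_roi(fmap: List[List[float]], top: int, left: int,
                 roi_h: int, roi_w: int,
                 out_h: int = 4, max_w: int = 32) -> List[List[float]]:
    """Max-pool an integer-aligned RoI of a 2-D feature map.

    The RoI covers rows [top, top + roi_h) and columns [left, left + roi_w).
    Each output cell takes the maximum over its bin; every bin spans at
    least one input cell, so small RoIs are handled by repeating cells.
    """
    grid_h, grid_w = dynamic_grid(roi_h, roi_w, out_h, max_w)
    pooled = []
    for i in range(grid_h):
        r0 = top + (i * roi_h) // grid_h
        r1 = max(r0 + 1, top + ((i + 1) * roi_h) // grid_h)
        row = []
        for j in range(grid_w):
            c0 = left + (j * roi_w) // grid_w
            c1 = max(c0 + 1, left + ((j + 1) * roi_w) // grid_w)
            row.append(max(fmap[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled
```

For a text line 10 cells tall and 25 cells wide, `dynamic_grid(10, 25)` gives a 4 x 10 grid, whereas a fixed square grid (for example 7 x 7, common in object detectors) would compress the width far more than the height.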

Bibliographic record

Source: International Conference on Document Analysis and Recognition | 2019 | 1 v. | 8 pages
Authors: Mohammad Reza Sarshogh; Keegan Hines
Original format: PDF
Chinese Library Classification: Automation technology and equipment
Keywords: Text recognition; Optical character recognition software; Feature extraction; Object detection; Head; Task analysis; Proposals

Similar literature

1. Text Detection and Recognition for Natural Scene Images Using Deep Convolutional Neural Networks [J]. Xianyu Wu, Chao Luo, Qian Zhang. Computers, Materials & Continua. 2019, No. 1
2. Text Detection and Recognition for Natural Scene Images Using Deep Convolutional Neural Networks [J]. Xianyu Wu, Chao Luo, Qian Zhang. Computers, Materials & Continua (English). 2019, No. 007
3. Image Splicing Localization using a Multi-task Fully Convolutional Network (MFCN) [J]. Salloum Ronald, Ren Yuzhuo, Kuo C.-C. Jay. Journal of Visual Communication & Image Representation. 2018, No. feba
4. A Multi-task Network for Localization and Recognition of Text in Images [C]. Mohammad Reza Sarshogh, Keegan Hines. International Conference on Document Analysis and Recognition. 2019
5. Multi-task learning deep neural networks for automatic speech recognition [D]. Chen, Dongpeng. 2015
6. An Algorithm Based on Text Position Correction and Encoder-Decoder Network for Text Recognition in the Scene Image of Visual Sensors [O]. Zhiwei Huang, Jinzhao Lin, Hongzhi Yang. 2020
7. Image Splicing Localization Using A Multi-Task Fully Convolutional Network (MFCN) [O]. Salloum, Ronald; Ren, Yuzhuo; Kuo, C.-C. Jay. 2017
