Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence

Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes

Abstract

Unifying text detection and text recognition in an end-to-end training fashion has become a new trend for reading text in the wild, as the two tasks are closely related and complementary. In this paper, we investigate the problem of scene text spotting, which aims at simultaneous text detection and recognition in natural images. We present an end-to-end trainable neural network named Mask TextSpotter. Unlike previous text spotters, which follow a pipeline consisting of a proposal generation network and a sequence-to-sequence recognition network, Mask TextSpotter enjoys a simple and smooth end-to-end learning procedure in which both detection and recognition are achieved directly in two-dimensional space via semantic segmentation. Further, a spatial attention module is proposed to enhance performance and universality. Benefiting from the proposed two-dimensional representation for both detection and recognition, it easily handles text instances of irregular shapes, such as curved text. We evaluate it on four English datasets and one multi-language dataset, achieving consistently superior performance over state-of-the-art methods in both detection and end-to-end text recognition tasks. Moreover, we investigate the recognition module of our method separately; it significantly outperforms state-of-the-art methods on both regular and irregular text datasets for scene text recognition.
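
To make the idea of recognition "directly from two-dimensional space via semantic segmentation" more concrete, the toy sketch below reads a string from a per-pixel character label map. It is not the decoding procedure of Mask TextSpotter, which the abstract does not describe; the function name `decode_label_map`, the label encoding (0 for background, k for the k-th character of a charset), and the left-to-right ordering rule are all assumptions made for illustration only.

```python
from collections import deque


def decode_label_map(label_map, charset):
    """Read a string from a 2D per-pixel character label map.

    label_map: list of rows; 0 means background, k >= 1 means charset[k - 1].
    Hypothetical helper for illustration, not the paper's decoding procedure.
    """
    height = len(label_map)
    width = len(label_map[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions = []  # (mean x of region, character)

    for y in range(height):
        for x in range(width):
            label = label_map[y][x]
            if label == 0 or seen[y][x]:
                continue
            # Flood-fill one 4-connected region of identical labels.
            queue = deque([(y, x)])
            seen[y][x] = True
            xs = []
            while queue:
                cy, cx = queue.popleft()
                xs.append(cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and label_map[ny][nx] == label):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append((sum(xs) / len(xs), charset[label - 1]))

    # Assumption: roughly horizontal text, so order regions left to right.
    regions.sort(key=lambda r: r[0])
    return "".join(ch for _, ch in regions)


# Toy 3x8 label map with charset "tac": label 3 = 'c', 2 = 'a', 1 = 't'.
toy_map = [
    [0, 3, 3, 0, 0, 0, 0, 0],
    [0, 3, 3, 0, 2, 2, 0, 1],
    [0, 0, 0, 0, 2, 2, 0, 1],
]
print(decode_label_map(toy_map, "tac"))  # prints "cat"
```

Because each character is located as a region in the plane rather than as a step in a one-dimensional sequence, the same representation could in principle be ordered along a curved baseline instead of the x-axis; the simple left-to-right sort here is only the easiest case to show.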