International Conference on Document Analysis and Recognition

EATEN: Entity-Aware Attention for Single Shot Visual Text Extraction

Abstract

Extracting Text of Interest (ToI) from images is a crucial part of many OCR applications, such as entity recognition for cards, invoices, and receipts. Most existing work relies on a complicated engineering pipeline that chains OCR with structured information extraction. This paper proposes the Entity-aware Attention Text Extraction Network (EATEN), an end-to-end trainable system that extracts ToIs without any post-processing. In the proposed framework, each entity is parsed by its own entity-aware decoder. We also introduce a state transition mechanism that further improves the robustness of visual ToI extraction. Because no public benchmark exists, we construct a dataset of almost 0.6 million images covering three real-world scenarios (train tickets, passports, and business cards), publicly available at https://github.com/beacandler/EATEN. To the best of our knowledge, EATEN is the first single-shot method for extracting entities from images. Extensive experiments on these benchmarks demonstrate the state-of-the-art performance of EATEN.
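The control flow described in the abstract, one decoder per entity reading shared image features, might be sketched as below. This is a toy illustration and not the authors' implementation: the decoders are stand-in functions where a real system would use attention-based networks, all names (`make_toy_decoder`, `extract_entities`) are hypothetical, and the abstract does not specify how the state transition works, so the sketch simply assumes that each decoder's final state is handed to the next decoder as its initial state.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration; none of these names come from the paper.
Features = List[float]          # stand-in for shared image features
State = float                   # stand-in for a decoder's hidden state
Decoder = Callable[[Features, State], Tuple[str, State]]


def make_toy_decoder(label: str) -> Decoder:
    """Build a stand-in decoder for one entity type."""
    def decode(features: Features, state: State) -> Tuple[str, State]:
        # Toy state update; a real decoder would attend over the features.
        new_state = state + sum(features)
        return f"{label}@{new_state:.1f}", new_state
    return decode


def extract_entities(
    features: Features,
    decoders: Dict[str, Decoder],
    initial_state: State = 0.0,
) -> Dict[str, str]:
    """Run every entity decoder once over the same features.

    Assumption: the final state of one decoder becomes the initial
    state of the next, in the order the decoders are given.
    """
    results: Dict[str, str] = {}
    state = initial_state
    for name, decode in decoders.items():
        text, state = decode(features, state)
        results[name] = text
    return results


if __name__ == "__main__":
    decoders = {
        "name": make_toy_decoder("name"),
        "date": make_toy_decoder("date"),
    }
    print(extract_entities([1.0, 2.0], decoders))
```

The point of the sketch is only the shape of the computation: the features are computed once, every entity has its own decoder, and no separate OCR or post-processing stage sits between the image and the extracted fields.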
