Journal: Pattern Recognition: The Journal of the Pattern Recognition Society

Multi-task learning for simultaneous script identification and keyword spotting in document images

Abstract

This paper proposes an end-to-end multi-task deep neural network for simultaneous script identification and keyword spotting (KWS) in multilingual handwritten and printed document images. We introduce a unified approach that addresses both challenges cohesively by designing a novel CNNBLSTM architecture. The script identification stage extracts both local and global features so that the network covers more relevant information. In contrast to traditional feature fusion approaches, which build a linear concatenation of features, we employ compact bilinear pooling to capture pairwise correlations between these features. The script identification result is then injected into the KWS module to eliminate characters of irrelevant scripts and to perform the decoding stage in single-script mode. All network parameters are trained end to end using multi-task learning that jointly minimizes the negative log-likelihood (NLL) loss for script identification and the connectionist temporal classification (CTC) loss for KWS. Our approach was evaluated on a variety of public datasets covering different languages and writing types. Experiments demonstrated the efficacy of our deep multi-task representation learning compared to state-of-the-art systems for both keyword spotting and script identification. (c) 2021 Elsevier Ltd. All rights reserved.
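
The single-script decoding idea in the abstract can be illustrated with a small sketch. The abstract does not say how the masking or decoding is implemented, so everything here is an assumption: the toy alphabet, the `SCRIPT_OF` mapping, the function names, and the use of greedy (best-path) CTC decoding are hypothetical choices made only to show how a predicted script could remove characters of other scripts before decoding.

```python
import math

BLANK = "<blank>"  # CTC blank symbol

# Hypothetical toy alphabet covering two scripts (not from the paper).
ALPHABET = [BLANK, "a", "b", "α", "β"]
SCRIPT_OF = {"a": "latin", "b": "latin", "α": "greek", "β": "greek"}


def mask_to_script(frame_scores, script):
    """Set scores of characters from other scripts to -inf; keep the blank."""
    return [
        s if (ch == BLANK or SCRIPT_OF[ch] == script) else -math.inf
        for ch, s in zip(ALPHABET, frame_scores)
    ]


def greedy_ctc_decode(frames, script):
    """Best-path CTC decoding restricted to the characters of one script."""
    best = []
    for scores in frames:
        masked = mask_to_script(scores, script)
        best.append(ALPHABET[max(range(len(masked)), key=masked.__getitem__)])
    # Standard CTC collapse: merge consecutive repeats, then drop blanks.
    out, prev = [], None
    for ch in best:
        if ch != prev and ch != BLANK:
            out.append(ch)
        prev = ch
    return "".join(out)


# Per-frame log-scores in ALPHABET order: blank, a, b, α, β (made-up values).
frames = [
    [-3.0, -1.0, -4.0, -0.5, -5.0],
    [-3.0, -1.0, -4.0, -0.5, -5.0],
    [-0.2, -2.0, -3.0, -4.0, -5.0],
    [-3.0, -4.0, -0.7, -5.0, -0.6],
]

print(greedy_ctc_decode(frames, "latin"))  # ab
print(greedy_ctc_decode(frames, "greek"))  # αβ
```

With the same frame scores, the decoded string changes with the script passed in, which is the effect the abstract attributes to injecting the script identification result into the KWS module.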