Robust shared feature learning for script and handwritten/machine-printed identification

Abstract

In this paper, we focus on the problem of script and handwritten/machine-printed identification of texts. We simultaneously identify the script (Chinese, English, Japanese, Korean, or Russian) and whether the text is handwritten or machine-printed by designing a dual-branch deep convolutional neural network (CNN). For the training stage, we propose a two-stage multi-task learning strategy to learn robust shared features for script and handwritten/machine-printed identification. Accordingly, we can perform both identification tasks using the proposed single CNN model. We compare the effects of training the CNN with inputs of different lengths. The experimental results show that text-line input is a suitable choice for the two identification tasks, as it effectively captures more discriminative content for both script and handwritten/machine-printed identification. Furthermore, we evaluate three CNNs of different scales (small, medium, and large) to determine the best architecture for script and handwritten/machine-printed identification. As shown by our experimental validation, combining text-line input with a larger architecture significantly improves performance. The accuracies achieved by the two-stage multi-task CNN for handwritten/machine-printed and script identification are 99% and 95%, respectively. (C) 2017 Elsevier B.V. All rights reserved.
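
The shared-feature, dual-branch idea in the abstract can be illustrated with a minimal sketch. This is not the authors' network: the abstract does not specify the CNN layers, the two training stages, or the loss, so the sketch assumes a precomputed shared feature vector, one linear head per task, and a weighted sum of two cross-entropy losses (a common multi-task formulation). All names (`dual_branch_forward`, `multi_task_loss`, `mode_weight`) are hypothetical.

```python
import math

# Class lists taken from the abstract; the index order is an assumption.
SCRIPTS = ["Chinese", "English", "Japanese", "Korean", "Russian"]
MODES = ["handwritten", "machine-printed"]


def linear(weights, bias, x):
    """Fully connected layer: one output per (row, bias) pair."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the target class."""
    return -math.log(probs[target_index])


def dual_branch_forward(shared_features, script_head, mode_head):
    """Both branches read the same shared feature vector.

    Each head is a (weights, bias) pair; in a real CNN the shared
    features would come from convolutional layers (not modeled here).
    """
    script_probs = softmax(linear(script_head[0], script_head[1], shared_features))
    mode_probs = softmax(linear(mode_head[0], mode_head[1], shared_features))
    return script_probs, mode_probs


def multi_task_loss(script_probs, mode_probs, script_label, mode_label,
                    mode_weight=1.0):
    """Assumed joint objective: weighted sum of the two task losses."""
    return (cross_entropy(script_probs, script_label)
            + mode_weight * cross_entropy(mode_probs, mode_label))


if __name__ == "__main__":
    features = [0.5, -1.0, 2.0]  # stand-in for shared CNN features
    # Zero-initialized heads give uniform predictions for both tasks.
    script_head = ([[0.0] * 3 for _ in SCRIPTS], [0.0] * len(SCRIPTS))
    mode_head = ([[0.0] * 3 for _ in MODES], [0.0] * len(MODES))
    sp, mp = dual_branch_forward(features, script_head, mode_head)
    print(multi_task_loss(sp, mp, script_label=0, mode_label=1))
```

Because both heads consume the same features, gradients from both losses would update the shared layers during training, which is what lets a single model serve both identification tasks.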

Bibliographic record

  • Source
    Pattern recognition letters | 2017, Issue 1 | pp. 6-13 | 8 pages
  • Author affiliations

    South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510641, Guangdong, Peoples R China;

    South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510641, Guangdong, Peoples R China;

    South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510641, Guangdong, Peoples R China;

    South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510641, Guangdong, Peoples R China;

    Fujitsu R&D Ctr Co Ltd, Beijing, Peoples R China;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of full text: eng
