Home > Foreign-language journals > Expert Systems with Applications > Text-line extraction from handwritten document images using GAN

Text-line extraction from handwritten document images using GAN



Abstract

Text-line extraction (TLE) from unconstrained handwritten document images is still considered an open research problem. A literature survey reveals that rule-based methods are commonplace in this regard, but they mostly fail when document images contain touching and/or multi-skewed text lines, overlapping words/characters, or non-uniform inter-line spacing. To address this problem, in this paper we use a deep learning-based method. In doing so, we apply, for the first time in the literature, Generative Adversarial Networks (GANs), treating TLE as an image-to-image translation task. We use the U-Net architecture for the generator and the PatchGAN architecture for the discriminator, with different combinations of loss functions, namely GAN loss, L1 loss, and L2 loss. Evaluation is carried out on two datasets: the handwritten Chinese text dataset HIT-MW and the ICDAR 2013 Handwriting Segmentation Contest dataset. After exhaustive experimentation, we observe that the U-Net architecture with the combination of the said three losses not only produces impressive results but also outperforms some state-of-the-art methods. (C) 2019 Elsevier Ltd. All rights reserved.
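The abstract describes training the generator with a combination of an adversarial GAN loss, an L1 loss, and an L2 loss. A minimal sketch of such a combined objective is given below, using NumPy; the function name and the loss weights (`w_gan`, `w_l1`, `w_l2`) are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def combined_generator_loss(fake_scores, generated, target,
                            w_gan=1.0, w_l1=100.0, w_l2=100.0):
    """Combine an adversarial term with L1/L2 pixel reconstruction losses.

    fake_scores: discriminator probabilities for generated images.
    generated, target: image arrays of identical shape.
    The weights are hypothetical; the paper's actual weighting is not stated here.
    """
    eps = 1e-12
    # Non-saturating adversarial loss for the generator: -log D(G(x))
    gan_loss = -np.mean(np.log(fake_scores + eps))
    # Pixel-wise reconstruction losses against the ground-truth line map
    l1_loss = np.mean(np.abs(generated - target))
    l2_loss = np.mean((generated - target) ** 2)
    return w_gan * gan_loss + w_l1 * l1_loss + w_l2 * l2_loss
```

When the generated image matches the target exactly, only the adversarial term remains, so the loss reduces to `-log D(G(x))`.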


