Journal: Expert Systems with Applications

Towards efficient unconstrained handwriting recognition using Dilated Temporal Convolution Network


Abstract

Recognition of cursive handwriting in images has advanced considerably with recent recurrent architectures and attention mechanisms. Most work focuses on improving transcription performance, measured by Character Error Rate (CER) and Word Error Rate (WER), but existing models are slow to train and test. Furthermore, recent studies recommend that models be not only effective in terms of task performance but also environmentally friendly in terms of carbon footprint. A review of recent state-of-the-art models suggests that training and retraining time should be considered at design time, since long training times increase costs in both resources and carbon footprint. This is challenging for handwriting recognition models built on popular recurrent architectures, and it is especially critical because line images are usually very wide, which results in a long sequence to decode. In this work, we present a fully convolutional deep network architecture for cursive handwriting recognition from line-level images. The architecture combines 2-D convolutions and 1-D dilated non-causal convolutions with a Connectionist Temporal Classification (CTC) output layer. This offers high parallelism with a smaller number of parameters. We further report experiments with various re-scaling factors of the images and show how they affect the performance of the proposed model. A data augmentation pipeline is also analyzed during model training. The experiments show that our model achieves CER and WER comparable to those of recurrent architectures. We compare against state-of-the-art models with different architectures based on Recurrent Neural Networks (RNNs) and their variants. The analysis reports training performance and network details on three different datasets of English and French handwriting. It shows that our model has fewer parameters and takes less training and testing time, making it suitable for low-resource and environment-friendly deployment.
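
The abstract names 1-D dilated non-causal convolutions as the sequence-modelling component but gives no layer details. The following is a minimal pure-Python sketch of that operation, not the authors' implementation. The kernel weights, the dilation schedule `[1, 2, 4]`, and the helper names `dilated_conv1d` and `receptive_field` are assumptions made for illustration only.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """Non-causal ("same"-padded) 1-D dilated convolution over one channel.

    Output step t combines inputs at t + (k - centre) * dilation, so it sees
    positions on both sides of t, unlike a causal convolution.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("use an odd kernel size so the window is centred")
    centre = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t + (k - centre) * dilation
            if 0 <= idx < len(x):  # positions outside the sequence act as zero padding
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input steps one output step depends on after stacking the layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    # A unit impulse in the middle of the sequence shows how far each layer reaches.
    signal = [0.0] * 15
    signal[7] = 1.0
    kernel = [1.0, 1.0, 1.0]  # illustrative weights, not learned ones
    dilations = [1, 2, 4]     # hypothetical schedule; the abstract does not give one

    h = signal
    for d in dilations:
        h = dilated_conv1d(h, kernel, d)

    reached = [i for i, v in enumerate(h) if v != 0.0]
    print("positions reached:", reached)
    print("receptive field:", receptive_field(len(kernel), dilations))
```

With this assumed schedule, three layers of kernel size 3 already cover 15 time steps, and every output step is computed independently of the others. That independence is what allows the parallelism the abstract contrasts with step-by-step recurrent decoding.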
