IAPR International Conference on Document Analysis and Recognition

Self-Training of BLSTM with Lexicon Verification for Handwriting Recognition



Abstract

Deep learning approaches now provide state-of-the-art performance in many computer vision tasks, such as handwriting recognition. However, the huge number of parameters in these models requires large annotated training datasets, which are difficult to obtain. Training neural networks with unlabeled data is thus one of the key problems for achieving significant progress in deep learning. In this article, we explore a new semi-supervised training strategy for training long short-term memory (LSTM) recurrent neural networks for isolated handwritten word recognition. Our self-training strategy relies on iteratively training a bidirectional LSTM recurrent neural network (BLSTM) on both labeled and unlabeled data. At each iteration, the currently trained network labels the unlabeled data and submits the resulting hypotheses to a very efficient "lexicon verification" rule; verified unlabeled samples are added to the labeled dataset at the end of the iteration. This verification stage has very low sensitivity to the lexicon size, and full word coverage of the dataset is not necessary for the semi-supervised method to be effective. The strategy enables self-training with a single BLSTM and shows promising results on the Rimes dataset.
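The abstract describes the self-training loop only at a high level. The following is a minimal, hypothetical Python sketch of such a loop, assuming the simplest form of the verification rule (accept a hypothesis only if it exactly matches a lexicon word); train_fn and decode_fn are placeholders for a real BLSTM/CTC training and decoding pipeline, not the authors' implementation.

    from typing import Callable, List, Set, Tuple

    def self_train(
        labeled: List[Tuple[object, str]],   # (word image, transcription) pairs
        unlabeled: List[object],             # word images without transcriptions
        lexicon: Set[str],                   # valid word vocabulary
        train_fn: Callable[[List[Tuple[object, str]]], object],  # data -> trained model
        decode_fn: Callable[[object, object], str],              # (model, image) -> hypothesis
        max_iters: int = 10,
    ) -> object:
        """Iteratively grow the labeled set with lexicon-verified hypotheses."""
        model = train_fn(labeled)
        for _ in range(max_iters):
            verified, rejected = [], []
            for image in unlabeled:
                hypothesis = decode_fn(model, image)
                # Lexicon verification (assumed exact-match rule): keep the
                # hypothesis only if it is a word of the lexicon.
                if hypothesis in lexicon:
                    verified.append((image, hypothesis))
                else:
                    rejected.append(image)
            if not verified:                 # nothing new passed verification: stop
                break
            labeled = labeled + verified     # verified samples join the labeled set
            unlabeled = rejected
            model = train_fn(labeled)        # retrain on the enlarged labeled set
        return model

Under this exact-match rule, enlarging the lexicon only adds more strings a hypothesis could match, which is consistent with the abstract's claim that the verification stage has very low sensitivity to lexicon size.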
