International Conference on Frontiers in Handwriting Recognition

DNN-HMM for Large Vocabulary Mongolian Offline Handwriting Recognition



Abstract

In this paper, we propose a large vocabulary Mongolian offline handwriting recognition system built on a hybrid deep neural network (DNN)-hidden Markov model (HMM) architecture, an approach that has shown superior performance on automatic speech recognition (ASR) tasks. We select 50 sub-characters from all shapes of Mongolian letters as the smallest modeling units. First, a set of intensity features is extracted from each segmented word using a sliding window that moves across the word image. Then, multiple context-dependent Gaussian mixture model (GMM)-HMMs are trained on these features. Finally, a DNN with 4 hidden layers is trained as a frame classifier, where the class labels are the state labels assigned to each input frame through forced alignment using the context-dependent model. To validate the proposed model, extensive experiments were carried out on the MHW database, which contains 100,000 handwritten words in the training set, 5,000 in Test set I and 14,085 in Test set II. The DNN-HMM trained on raw image pixels yields the best performance, with an accuracy of 97.61% on Test set I and 94.14% on Test set II.
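
The sliding-window step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sliding_window_frames`, the window height of 4 rows, the shift of 2 rows, and the choice to slide along the vertical axis (Mongolian script is written top to bottom) are all assumptions made here for illustration. Each frame is simply the flattened raw pixel intensities under the window, which is the kind of input a frame classifier would receive.

```python
import numpy as np


def sliding_window_frames(image, window=4, shift=2):
    """Cut a 2-D grayscale word image into overlapping frames along axis 0.

    Each frame holds the flattened raw pixel intensities of `window`
    consecutive rows. `window` and `shift` are illustrative values only,
    not taken from the paper.
    """
    image = np.asarray(image, dtype=np.float32)
    if image.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")
    height = image.shape[0]
    if height < window:
        raise ValueError("image is shorter than the window")
    # Start positions of every window that fits entirely inside the image.
    starts = range(0, height - window + 1, shift)
    return np.stack([image[s:s + window].ravel() for s in starts])


# Toy 10x3 "word image" whose pixel values are just 0..29.
word = np.arange(10 * 3, dtype=np.float32).reshape(10, 3)
frames = sliding_window_frames(word, window=4, shift=2)
print(frames.shape)  # (number of frames, window * image width)
```

In a hybrid system of the kind the abstract describes, each row of `frames` would be paired with an HMM state label obtained from forced alignment, and the DNN would be trained to predict that label from the frame.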
