Sub-Word Based Mongolian Offline Handwriting Recognition


Abstract

Mongolian is an agglutinative language, so a large number of words are derived from the same stem by attaching different suffixes. This morphological richness leads to high out-of-vocabulary (OOV) rates and causes data sparsity. The recognition system proposed in this paper consists of three parts: handwritten image preprocessing, mapping of images to grapheme sequences, and sub-word-based language model (LM) decoding. We present a sub-word-based n-gram LM to address the high OOV rate. Based on the characteristics of Mongolian, we modified the traditional token passing algorithm to improve decoding speed and to make it easy to combine with any n-gram LM. We evaluated sub-words at different levels on the open Mongolian offline handwriting dataset (MHW). The bi-syllable 2-gram LM performed best, with word error rates (WERs) of 18.32% and 23.22% on the two test sets. Our experiments show that this method predicts in-vocabulary words with higher accuracy and also predicts OOV words with some accuracy.
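
The idea that a sub-word n-gram LM can still score words missing from a word-level vocabulary can be illustrated with a minimal sketch. This is not the authors' implementation: the placeholder units (`stem_a`, `suf_x`, and so on), the stem-plus-suffix segmentation, and the add-one smoothing are all assumptions chosen for illustration, and the abstract does not say which smoothing or segmentation the paper uses.

```python
import math
from collections import Counter

# Hypothetical word-boundary markers, not taken from the paper.
BOS, EOS = "<w>", "</w>"


def train_bigram(segmented_words):
    """Collect bigram statistics over sub-word sequences."""
    context_counts = Counter()  # how often each unit appears as a left context
    bigram_counts = Counter()   # how often each (previous, current) pair appears
    vocab = {EOS}               # units that can be predicted
    for units in segmented_words:
        seq = [BOS, *units, EOS]
        vocab.update(units)
        for prev, cur in zip(seq, seq[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def log_prob(units, model):
    """Log-probability of one word under the add-one smoothed bigram model."""
    context_counts, bigram_counts, vocab = model
    seq = [BOS, *units, EOS]
    total = 0.0
    for prev, cur in zip(seq, seq[1:]):
        numerator = bigram_counts[(prev, cur)] + 1
        denominator = context_counts[prev] + len(vocab)
        total += math.log(numerator / denominator)
    return total


# Toy training data: placeholder units, not real Mongolian sub-words.
training_words = [
    ["stem_a", "suf_x"],
    ["stem_a", "suf_y"],
    ["stem_b", "suf_y"],
]
model = train_bigram(training_words)

# This stem + suffix combination never occurs as a whole word in training,
# so a word-level vocabulary would treat it as OOV.
oov_word = ["stem_b", "suf_x"]
implausible_word = ["suf_x", "stem_b"]  # same units in an unseen order

oov_score = log_prob(oov_word, model)
implausible_score = log_prob(implausible_word, model)
```

In this toy setup the unseen combination receives a finite score because each of its units was observed in other words, and it scores higher than the same units in reversed order. A real system would use a stronger smoothing method and the sub-word levels evaluated in the paper.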

Bibliographic details

Source
International Conference on Document Analysis and Recognition, 2019, pp. 246-253 (8 pages)
Authors
Daoerji Fan; Guanglai Gao; Huijuan Wu
Original format
PDF
Keywords
Handwriting recognition; Decoding; Hidden Markov models; Encoding; Shape; Vocabulary; Image segmentation
Date indexed
2022-08-26 14:34:50

Similar literature

1. Sub-word Based Offline Handwritten Farsi Word Recognition Using Recurrent Neural Network [J]. Mohammad Fazel Younessy Ghadikolaie, Ehsanolah Kabir, Farbod Razzazi. ETRI Journal. 2016, No. 4
2. Offline Isolated Arabic Handwriting Character Recognition System Based on SVM [J]. Salam Mustafa, Hassan Alia Abdul. The International Arab Journal of Information Technology. 2019, No. 3
3. Sub-Word Based Mongolian Offline Handwriting Recognition [C]. Daoerji Fan, Guanglai Gao, Huijuan Wu. International Conference on Document Analysis and Recognition. 2019
4. Warping-Based Approach to Offline Handwriting Recognition [D]. Kennard, Douglas J. 2013
5. ClothFace: A Batteryless RFID-Based Textile Platform for Handwriting Recognition [O]. Han He, Xiaochen Chen, Adnan Mehmood, 2020
6. Confidence and Margin-Based MMI/MPE Discriminative Training for Offline Handwriting Recognition [O]. Dreuw, Philippe, Heigold, Georg, Ney, Hermann. 2011
