Improving Handwritten Chinese Text Recognition by Unsupervised Language Model Adaptation

Abstract

This paper investigates the effects of unsupervised language model adaptation (LMA) in handwritten Chinese text recognition. Because no prior information about the text to be recognized is available, we use a two-pass recognition strategy. In the first pass, a generic language model (LM) produces a preliminary result, which is used to choose the best-matched LMs from a set of pre-defined domains; the matched LMs are then used in the second recognition pass. Each LM is compressed to a moderate size via entropy-based pruning, tree-structure formatting, and fewer-byte quantization. We evaluated LMA for five LM types, including both character-level and word-level ones. Experiments on the CASIA-HWDB database show that language model adaptation improves performance for each LM type in all domains. Documents in the ancient domain gained the most: the character-level correct rate rose by 5.87 percent and the accurate rate by 6.05 percent.
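
The domain-matching step between the two passes can be illustrated with a minimal sketch. The abstract does not say how the best-matched LMs are chosen, so the criterion here (lowest perplexity of the first-pass transcript under each domain LM) is an assumption, and the add-alpha smoothed character bigram model is a simple stand-in for the paper's LM types. All function names, the toy corpora, and `vocab_size` are hypothetical.

```python
import math
from collections import Counter


def train_bigram(text):
    """Count bigram contexts and character bigrams of a training string."""
    contexts = Counter(text[:-1])            # every character that has a successor
    bigrams = Counter(zip(text, text[1:]))   # (previous, current) character pairs
    return contexts, bigrams


def perplexity(model, text, vocab_size, alpha=1.0):
    """Perplexity of `text` under an add-alpha smoothed character bigram model."""
    contexts, bigrams = model
    log_prob = 0.0
    n = 0
    for prev, cur in zip(text, text[1:]):
        p = (bigrams[(prev, cur)] + alpha) / (contexts[prev] + alpha * vocab_size)
        log_prob += math.log(p)
        n += 1
    return math.exp(-log_prob / max(n, 1))


def select_domain(first_pass_text, domain_models, vocab_size):
    """Pick the domain whose LM gives the lowest perplexity on the first-pass result."""
    return min(
        domain_models,
        key=lambda d: perplexity(domain_models[d], first_pass_text, vocab_size),
    )


# Toy domain corpora (hypothetical); real domain LMs would be trained on large text sets.
domain_models = {
    "news": train_bigram("股市今日上涨股市明日下跌"),
    "ancient": train_bigram("子曰学而时习之不亦说乎"),
}
first_pass = "子曰学而不思"  # stands in for the preliminary first-pass transcript
chosen = select_domain(first_pass, domain_models, vocab_size=5000)
print(chosen)
```

In a full system the selected domain LM would then replace the generic LM in the second recognition pass; that decoding step is not shown here.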

Bibliographic details

Source: IAPR International Workshop on Document Analysis Systems, 2012, 5 pages
Author: Qiu-Feng Wang
Original format: PDF
Chinese Library Classification: TP391-53

Similar literature

1. Unsupervised language model adaptation for handwritten Chinese text recognition [J]. Qiu-Feng Wang, Fei Yin, Cheng-Lin Liu. Pattern Recognition: The Journal of the Pattern Recognition Society. 2014, No. 3
2. Unsupervised writer adaptation applied to handwritten text recognition [J]. Nosary A, Heutte L, Paquet T. Pattern Recognition: The Journal of the Pattern Recognition Society. 2004, No. 2
3. Integration of N-gram Language Models in Multiple Classifier Systems for Offline Handwritten Text Line Recognition [J]. Roman Bertolami, Horst Bunke. International Journal of Pattern Recognition and Artificial Intelligence. 2008, No. 7
4. Improving Handwritten Chinese Text Recognition by Unsupervised Language Model Adaptation [C]. Qiu-Feng Wang. Document Analysis Systems (DAS), 2012 10th IAPR International Workshop on. 2012
5. From Translation to Adaptation: Chinese Language Texts and Early Modern Japanese Literature [D]. Hartmann, Nan Ma. 2014
6. Unsupervised Medical Entity Recognition and Linking in Chinese Online Medical Text [O]. Jing Xu, Liang Gan, Mian Cheng. 2018
7. Integrating Language Model in Handwritten Chinese Text Recognition [O]. Qiu-feng Wang, Fei Yin, Cheng-lin Liu. 2015
