
Influence of language models and candidate set size on contextual post-processing for Chinese script recognition



Abstract

In Chinese, a word consisting of one or more characters is the basic syntactically meaningful unit; however, each character in a word also has a definite meaning of its own. We compare the perplexities of four n-gram language models (character-based bigram, character-based trigram, word-based bigram, and class-based bigram) and their influence on the performance of contextual post-processing of Chinese scripts in an offline handwritten Chinese character recognition system. We also examine in detail how the candidate set size affects the performance of contextual post-processing, and indicate that the number of candidates should vary from script to script.
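The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: the perplexity of a character-based bigram model, and contextual post-processing that picks the best character sequence from per-position candidate sets. Everything specific here is an assumption made for illustration: the add-one smoothing, the toy corpus, the recognizer log-scores, and the names `CharBigramLM`, `postprocess`, `k`, and `lm_weight` do not come from the paper.

```python
import math
from collections import Counter


class CharBigramLM:
    """Character-based bigram model with add-one smoothing (an assumed choice)."""

    def __init__(self, corpus):
        self.context = Counter()   # how often each character appears as a left context
        self.bigrams = Counter()   # how often each (previous, current) pair appears
        self.vocab = set()
        for sentence in corpus:
            chars = ["<s>"] + list(sentence)
            self.vocab.update(chars)
            for prev, cur in zip(chars, chars[1:]):
                self.context[prev] += 1
                self.bigrams[(prev, cur)] += 1

    def logprob(self, prev, cur):
        # Add-one smoothing over the training vocabulary size.
        v = len(self.vocab)
        return math.log((self.bigrams[(prev, cur)] + 1) / (self.context[prev] + v))

    def perplexity(self, sentences):
        # exp of the negative mean log-probability per character.
        total, n = 0.0, 0
        for sentence in sentences:
            chars = ["<s>"] + list(sentence)
            for prev, cur in zip(chars, chars[1:]):
                total += self.logprob(prev, cur)
                n += 1
        return math.exp(-total / n)


def postprocess(lattice, lm, k, lm_weight=1.0):
    """Viterbi search over the top-k candidates at each position.

    lattice: one list per character position, each holding
             (character, recognizer_log_score) pairs sorted best first.
    Returns the character string with the highest combined score.
    """
    best = {"<s>": (0.0, "")}  # last character -> (score, path so far)
    for candidates in lattice:
        nxt = {}
        for char, rec_score in candidates[:k]:
            score, path = max(
                (prev_score + lm_weight * lm.logprob(prev, char), prev_path)
                for prev, (prev_score, prev_path) in best.items()
            )
            cand = (score + rec_score, path + char)
            if char not in nxt or cand > nxt[char]:
                nxt[char] = cand
        best = nxt
    return max(best.values())[1]


# Toy training corpus and a hypothetical recognizer output (scores are made up).
corpus = ["我们比较语言模型", "语言模型的影响", "我们的系统"]
lm = CharBigramLM(corpus)

lattice = [
    [("语", -0.1), ("话", -2.5)],
    [("信", -0.6), ("言", -0.9)],  # the recognizer's top choice is wrong here
    [("模", -0.2), ("横", -1.8)],
    [("型", -0.3), ("刑", -1.5)],
]

print(postprocess(lattice, lm, k=1))
print(postprocess(lattice, lm, k=2))
print(lm.perplexity(["语言模型"]))
```

With `k=1` the search can only reproduce the recognizer's top choices (语信模型), while with `k=2` the bigram context recovers 语言模型. In this toy setting the candidate set size bounds what post-processing can correct, which is the kind of effect the abstract describes; a larger `k` also admits more confusable characters, so the best value need not be the same for every script.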
