Source: 2013 International Conference on Oriental COCOSDA

Traditional Chinese parser and language modeling for Mandarin ASR


Abstract

This paper proposes a new traditional Chinese parser for improving language modeling in Mandarin speech recognition. The parser first applies a preprocessing step to correct some word segmentation inconsistencies in the text corpus. It then employs a CRF-based word segmentation method and a CRF-based POS tagger to resegment the texts, generating better word strings for training an n-gram language model (LM) for ASR. Experimental results on the TCC-300 corpus show that the proposed method achieves a word error rate (WER) of 13.4%, a relative WER reduction of about 45% compared with the previous system.
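
The two metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard definition of WER (word-level edit distance divided by reference length) and the usual definition of relative reduction; the function names and the toy word lists are hypothetical and do not come from the paper, which does not state the baseline WER.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]  # aligning i reference words to an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


def relative_wer_reduction(baseline_wer, new_wer):
    """Fraction of the baseline WER removed by the new system."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Toy example with placeholder tokens (not data from the paper).
ref = ["a", "b", "c", "d"]
hyp = ["a", "x", "c"]
toy_wer = word_error_rate(ref, hyp)
toy_gain = relative_wer_reduction(0.20, 0.10)
```

Because WER is computed over word tokens, the way the reference text is segmented into words affects both the LM training data and the scoring units, which is why consistent segmentation matters in this setting.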
