A class-based language model for large-vocabulary speech recognition extracted from part-of-speech statistics

Abstract

A novel approach to class-based language modeling, based on part-of-speech statistics, is presented. It uses a deterministic word-to-class mapping, which handles words with alternative part-of-speech assignments through the use of ambiguity classes. The predictive power of word-based language models and the generalization capability of class-based language models are combined in two ways, linear interpolation and word-to-class backoff, and both methods are evaluated. Since each word belongs to precisely one ambiguity class, an exact word-to-class backoff model can easily be constructed. Empirical evaluations on large-vocabulary speech-recognition tasks show perplexity improvements and significant reductions in word error rate.
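The two combination schemes named in the abstract can be illustrated with a small sketch. This is not the paper's implementation. The bigram order, the toy lexicon and corpus, the add-one smoothing of class transitions, the absolute discount, and all function names are assumptions chosen to keep the example short. The one property taken from the abstract is that every word maps to exactly one ambiguity class (here, the set of all tags the word can take), which is what lets the backoff weight be normalized exactly.

```python
from collections import Counter

# Hypothetical toy lexicon: every part-of-speech tag each word can take.
LEXICON = {
    "the": {"DET"}, "a": {"DET"}, "dog": {"NOUN"},
    "walk": {"NOUN", "VERB"}, "run": {"NOUN", "VERB"}, "barks": {"VERB"},
}
# Hypothetical toy training text (every lexicon word occurs at least once).
CORPUS = "the dog barks a dog run the walk a run the dog walk".split()


def ambiguity_class(word):
    """Deterministic word-to-class map: the class is the word's full tag set."""
    return "|".join(sorted(LEXICON[word]))


VOCAB = sorted(LEXICON)
CLASSES = sorted({ambiguity_class(w) for w in VOCAB})
PAIRS = list(zip(CORPUS, CORPUS[1:]))

word_bigrams = Counter(PAIRS)
history_counts = Counter(CORPUS[:-1])
class_bigrams = Counter((ambiguity_class(u), ambiguity_class(v)) for u, v in PAIRS)
class_history = Counter(ambiguity_class(u) for u in CORPUS[:-1])
word_counts = Counter(CORPUS)
class_counts = Counter(ambiguity_class(w) for w in CORPUS)


def p_word(w, prev):
    """Maximum-likelihood word bigram P(w | prev)."""
    return word_bigrams[(prev, w)] / history_counts[prev]


def p_class(w, prev):
    """Class bigram: P(c(w) | c(prev)) * P(w | c(w)).

    Class transitions use add-one smoothing (an assumption of this sketch)
    so that every word keeps a nonzero probability.
    """
    c_prev, c_w = ambiguity_class(prev), ambiguity_class(w)
    p_transition = (class_bigrams[(c_prev, c_w)] + 1) / (class_history[c_prev] + len(CLASSES))
    p_membership = word_counts[w] / class_counts[c_w]
    return p_transition * p_membership


def p_interpolated(w, prev, lam=0.7):
    """Linear interpolation of the word model and the class model."""
    return lam * p_word(w, prev) + (1 - lam) * p_class(w, prev)


def p_backoff(w, prev, discount=0.5):
    """Word-to-class backoff with absolute discounting (assumed scheme).

    Seen bigrams use the discounted word estimate. The mass freed by the
    discount is spread over unseen words in proportion to the class model.
    """
    if word_bigrams[(prev, w)] > 0:
        return (word_bigrams[(prev, w)] - discount) / history_counts[prev]
    seen = [v for v in VOCAB if word_bigrams[(prev, v)] > 0]
    reserved = discount * len(seen) / history_counts[prev]
    unseen_mass = sum(p_class(v, prev) for v in VOCAB if v not in seen)
    return reserved * p_class(w, prev) / unseen_mass
```

In this toy setup the bigram "dog the" never occurs, so the word model alone gives it zero probability, while both combined models assign it a positive value drawn from the class statistics. Both combined distributions sum to one over the vocabulary for every history.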
