Conference of the International Speech Communication Association

Language Modeling for Mixed Language Speech Recognition using Weighted Phrase Extraction



Abstract

To train a code switching language model for mixed language speech recognition, we propose assigning weights to the sentence pairs in the parallel text data. The code switching language model, which is composed of a code switching boundary prediction model, a code switching translation model, and a reconstruction model, is incorporated with a language for mixed language speech recognition. The code switching translation model, which is trained on selected subsets of the sentence pairs in the parallel text data, allows the decoder to decide whether a phrase is in the matrix language or in the embedded language. Moreover, we propose a weighting procedure for training the code switching translation model. We evaluate our methods on Mandarin-English code switching lecture speech and lunch conversations. Compared with the conventional interpolated language model, our proposed method reduces word error rate by a statistically significant 1.74% on the lecture speech and by 1.29% on the lunch conversations.
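The results above are reported as word error rate. As a reminder of how that metric is usually computed, here is a minimal sketch: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the abstract does not describe the authors' scoring setup, and mixed Mandarin-English evaluations often score Mandarin at the character level instead.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: tokens are separated by whitespace. This is an
    illustrative simplification, not the paper's scoring procedure.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (b -> x) and one deletion (d) over four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

A reduction quoted in percentage points corresponds to the difference between two such rates, each multiplied by 100.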
