International Conference on Speech and Computer

A Comparison of Language Model Training Techniques in a Continuous Speech Recognition System for Serbian

Abstract

In this paper, several language model training techniques are examined and applied in a large-vocabulary continuous speech recognition system for the Serbian language (more than 120,000 words), namely the Mikolov and Yandex RNNLM toolkits, TensorFlow-based GPU approaches, and the CUED-RNNLM approach. The baseline acoustic model is a chain sub-sampled time-delay neural network, trained with cross-entropy and a sequence-level objective function on a database of about 200 h of speech. The baseline language model is a 3-gram model trained on the training portion of the database transcriptions and the Serbian journalistic corpus (about 600,000 utterances), using the SRILM toolkit and Kneser-Ney smoothing with a pruning value of 10^-7 (the previous best configuration). The results are analyzed in terms of word and character error rates and the perplexity of each language model on the training and validation sets. A relative improvement of 22.4% over the baseline language model is obtained, with a best word error rate of 7.25%.
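
The baseline n-gram recipe is specified closely enough (3-gram order, Kneser-Ney smoothing, 10^-7 pruning, SRILM) to sketch with the standard SRILM command-line tools. The following minimal Python wrapper is an illustration under those assumptions; the file names are hypothetical placeholders, and vocabulary handling is omitted:

    import subprocess

    TRAIN_TEXT = "train.txt"   # transcriptions + journalistic corpus, one utterance per line
    VALID_TEXT = "valid.txt"
    LM_PATH = "serbian_3gram.arpa"

    # Train a 3-gram LM with interpolated Kneser-Ney smoothing,
    # entropy-pruned at 1e-7 as in the baseline described above.
    subprocess.run([
        "ngram-count",
        "-text", TRAIN_TEXT,
        "-order", "3",
        "-kndiscount", "-interpolate",
        "-prune", "1e-7",
        "-lm", LM_PATH,
    ], check=True)

    # Report the perplexity of the trained model on the validation set.
    subprocess.run(["ngram", "-lm", LM_PATH, "-order", "3", "-ppl", VALID_TEXT], check=True)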
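The abstract does not describe the recurrent architectures themselves, so the sketch below is only a generic TensorFlow recurrent language model of the kind the compared toolkits train; the embedding and hidden-layer sizes are assumptions, not values from the paper:

    import tensorflow as tf

    VOCAB_SIZE = 120_000   # the paper reports a vocabulary of more than 120,000 words
    EMBED_DIM = 256        # assumed, not given in the abstract
    HIDDEN_DIM = 512       # assumed, not given in the abstract

    # Predict the next word at every position of the input sequence.
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
        tf.keras.layers.LSTM(HIDDEN_DIM, return_sequences=True),
        tf.keras.layers.Dense(VOCAB_SIZE),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
    # Training data: (inputs, targets) pairs of word-id sequences,
    # where targets are the inputs shifted left by one position.
    # model.fit(dataset, epochs=...)

With a vocabulary above 120,000 words, the full softmax output layer dominates the cost of training and inference, which is why RNNLM toolkits typically resort to class-factored or noise-contrastive output layers, especially on GPUs.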

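Perplexity, the intrinsic metric reported alongside word and character error rates, is the exponentiated average negative log-probability a model assigns to a held-out sequence of N words:

    \mathrm{PPL} = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\ln P(w_i \mid w_1,\dots,w_{i-1})\right)

Lower perplexity means the model concentrates more probability on the observed text; it usually, but not always, correlates with a lower word error rate.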