International Conference on Speech and Computer

A Comparison of Language Model Training Techniques in a Continuous Speech Recognition System for Serbian

Abstract

In this paper, several language model training techniques are examined and applied in a large-vocabulary continuous speech recognition system for the Serbian language (more than 120,000 words), namely the Mikolov and Yandex RNNLM toolkits, TensorFlow-based GPU approaches, and the CUED-RNNLM approach. The baseline acoustic model is a chain sub-sampled time-delay neural network, trained with cross-entropy and a sequence-level objective function on a database of about 200 h of speech. The baseline language model is a 3-gram model trained on the training portion of the database transcriptions and the Serbian journalistic corpus (about 600,000 utterances), using the SRILM toolkit and Kneser-Ney smoothing with a pruning value of 10^-7 (the previous best configuration). The results are analyzed in terms of word and character error rates and the perplexity of each language model on the training and validation sets. A relative improvement of 22.4% over the baseline language model is obtained, with a best word error rate of 7.25%.
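
The baseline n-gram recipe is specified closely enough (3-gram order, Kneser-Ney smoothing, 10^-7 pruning, SRILM) to sketch with the standard SRILM command-line tools. The following minimal Python wrapper is an illustration under those assumptions; the file names are hypothetical placeholders, and vocabulary handling is omitted:

    import subprocess

    TRAIN_TEXT = "train.txt"   # transcriptions + journalistic corpus, one utterance per line
    VALID_TEXT = "valid.txt"
    LM_PATH = "serbian_3gram.arpa"

    # Train a 3-gram LM with interpolated Kneser-Ney smoothing,
    # entropy-pruned at 1e-7 as in the baseline described above.
    subprocess.run([
        "ngram-count",
        "-text", TRAIN_TEXT,
        "-order", "3",
        "-kndiscount", "-interpolate",
        "-prune", "1e-7",
        "-lm", LM_PATH,
    ], check=True)

    # Report the perplexity of the trained model on the validation set.
    subprocess.run(["ngram", "-lm", LM_PATH, "-order", "3", "-ppl", VALID_TEXT], check=True)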
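The abstract does not describe the recurrent architectures themselves, so the sketch below is only a generic TensorFlow recurrent language model of the kind the compared toolkits train; the embedding and hidden-layer sizes are assumptions, not values from the paper:

    import tensorflow as tf

    VOCAB_SIZE = 120_000   # the paper reports a vocabulary of more than 120,000 words
    EMBED_DIM = 256        # assumed, not given in the abstract
    HIDDEN_DIM = 512       # assumed, not given in the abstract

    # Predict the next word at every position of the input sequence.
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
        tf.keras.layers.LSTM(HIDDEN_DIM, return_sequences=True),
        tf.keras.layers.Dense(VOCAB_SIZE),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
    # Training data: (inputs, targets) pairs of word-id sequences,
    # where targets are the inputs shifted left by one position.
    # model.fit(dataset, epochs=...)

With a vocabulary above 120,000 words, the full softmax output layer dominates the cost of training and inference, which is why RNNLM toolkits typically resort to class-factored or noise-contrastive output layers, especially on GPUs.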

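Perplexity, the intrinsic metric reported alongside word and character error rates, is the exponentiated average negative log-probability a model assigns to a held-out sequence of N words:

    \mathrm{PPL} = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\ln P(w_i \mid w_1,\dots,w_{i-1})\right)

Lower perplexity means the model concentrates more probability on the observed text; it usually, but not always, correlates with a lower word error rate.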