Published in: International Conference on Speech and Computer

A Comparison of Language Model Training Techniques in a Continuous Speech Recognition System for Serbian



Abstract

In this paper, several language model training techniques are examined and applied in a large-vocabulary continuous speech recognition system for the Serbian language (more than 120,000 words), namely the Mikolov and Yandex RNNLM toolkits, TensorFlow-based GPU approaches, and the CUED-RNNLM approach. The baseline acoustic model is a chain sub-sampled time-delay neural network, trained with cross-entropy training and a sequence-level objective function on a database of about 200 hours of speech. The baseline language model is a 3-gram model trained on the training portion of the database transcriptions and the Serbian journalistic corpus (about 600,000 utterances), using the SRILM toolkit and Kneser-Ney smoothing with a pruning value of 10^(-7) (previous best). The results are analyzed in terms of word and character error rates and the perplexity of a given language model on the training and validation sets. A relative improvement of 22.4% (best word error rate of 7.25%) is obtained in comparison to the baseline language model.
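The metrics reported above are word error rate and perplexity. As a self-contained sketch (not the authors' evaluation code), word error rate can be computed as the Levenshtein edit distance over word sequences divided by the reference length, and perplexity as the exponentiated average negative log-probability per token:

```python
import math

def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between first i reference words and first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dp[i - 1][j] + 1
            insertion = dp[i][j - 1] + 1
            dp[i][j] = min(substitution, deletion, insertion)
    return dp[len(ref)][len(hyp)] / len(ref)

def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities assigned by an LM."""
    return math.exp(-sum(log_probs) / len(log_probs))
```

For the baseline n-gram model itself, SRILM's `ngram-count` supports the setup the abstract describes (e.g. `-order 3 -kndiscount -interpolate -prune 1e-7`), and `ngram -ppl` evaluates perplexity on a held-out set; the character error rate is the same edit-distance computation applied to character rather than word sequences.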


