Conference: International Conference on Speech and Computer

Deep Neural Network Based Continuous Speech Recognition for Serbian Using the Kaldi Toolkit

Abstract

This paper presents a deep neural network (DNN) based large vocabulary continuous speech recognition (LVCSR) system for Serbian, developed using the open-source Kaldi speech recognition toolkit. The DNNs are initialized using stacked restricted Boltzmann machines (RBMs) and trained with the standard error backpropagation procedure, using cross-entropy as the objective function, to provide posterior probability estimates for the hidden Markov model (HMM) states. Emission densities of HMM states are represented as Gaussian mixture models (GMMs). The recipes were modified to account for the particularities of the Serbian language in order to achieve the best results. A corpus of approximately 90 hours of speech (21,000 utterances) is used for training. Performance is compared on two different sets of utterances between the baseline GMM-HMM system and various DNN configurations.
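
As a rough illustration of the cross-entropy objective and the per-state posterior estimates mentioned in the abstract, the following minimal sketch computes both for a single frame. It is not taken from the paper or from Kaldi: the function names, the three-state toy setup, and all numeric values are hypothetical. The final step, dividing posteriors by state priors to obtain scaled likelihoods, is the common hybrid DNN-HMM convention and is assumed here rather than stated in the abstract.

```python
import math

def softmax(logits):
    """Turn raw network outputs into a probability distribution over HMM states."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def frame_cross_entropy(logits, target_state):
    """Cross-entropy for one frame: -log P(target_state | frame)."""
    return -math.log(softmax(logits)[target_state])

def scaled_log_likelihoods(posteriors, state_priors):
    """Assumed hybrid convention: log p(x|s) is proportional to log P(s|x) - log P(s)."""
    return [math.log(p) - math.log(q) for p, q in zip(posteriors, state_priors)]

# Hypothetical toy values: one frame, three HMM states.
logits = [2.0, 0.5, -1.0]   # made-up output-layer activations
priors = [0.5, 0.3, 0.2]    # made-up state priors (e.g. from alignment counts)

posteriors = softmax(logits)
loss = frame_cross_entropy(logits, target_state=0)
scores = scaled_log_likelihoods(posteriors, priors)
```

During training, the per-frame loss would be averaged over all aligned frames and minimized by backpropagation; at decoding time, the scaled scores would stand in for the acoustic likelihoods of the HMM states.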
