
SpeeD's DNN approach to Romanian speech recognition


Abstract

This paper presents the main recent improvements to the large-vocabulary continuous speech recognition (LVCSR) system for the Romanian language developed by the Speech and Dialogue (SpeeD) research laboratory. While the most important improvement is the use of DNN-based acoustic models instead of the classic HMM-GMM approach, several other aspects are discussed in the paper: a significant expansion of the speech training corpus; the use of additional algorithms for feature processing, speaker adaptive training, and discriminative training; and, finally, the use of lattice rescoring with significantly expanded language models (n-gram models up to order 5, based on vocabularies of up to 200k words). The ASR experiments were performed with several types of acoustic and language models, in different configurations, on the standard read and conversational speech corpora created by SpeeD in 2014. The results show that extending the training speech corpus leads to a relative word error rate (WER) improvement between 15% and 17%, while using DNN-based acoustic models instead of HMM-GMM-based acoustic models leads to a relative WER improvement between 18% and 23%, depending on the nature of the evaluation speech corpus (read or conversational, clean or noisy). The best configuration of the LVCSR system was integrated into a live transcription web application, available online on the SpeeD laboratory's website at https://speed.pub.ro/live-transcriber-2017.
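For readers unfamiliar with the metric, the relationship between absolute WER and the relative WER improvements quoted in the abstract can be sketched as follows. This is a minimal illustration, not the authors' scoring tool: the function names `word_error_rate` and `relative_wer_improvement` are hypothetical, and the sample sentences and WER values are invented for the example and do not come from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which the new system reduces the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Invented example: one substitution and one deletion over four words.
    print(word_error_rate("a b c d", "a x c"))
    # Invented WER values: a drop from 20% to 16% absolute WER.
    print(relative_wer_improvement(0.20, 0.16))
```

Under this definition, a relative improvement of 20% means the new WER is 80% of the baseline WER; it does not mean the absolute WER dropped by 20 percentage points.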
