2012 IEEE Workshop on Spoken Language Technology

Context-dependent Deep Neural Networks for audio indexing of real-life data

Abstract

We apply Context-Dependent Deep-Neural-Network HMMs (CD-DNN-HMMs) to the real-life problem of audio indexing of data from various sources. We recently showed that, on the Switchboard benchmark for speaker-independent transcription of phone calls, CD-DNN-HMMs with 7 hidden layers reduce the word error rate (WER) by as much as one-third compared to discriminatively trained Gaussian-mixture HMMs (GMM-HMMs), and by one-fourth if the GMM-HMM also uses fMPE features. This paper takes CD-DNN-HMM-based recognition into a real-life deployment for audio indexing. We find that for our best speaker-independent CD-DNN-HMM, with 32k senones trained on 2000 h of data, the one-fourth reduction does carry over to inhomogeneous field data (video podcasts and talks). Compared to a speaker-adaptive GMM system, the relative improvement is 18%, at very similar end-to-end runtime. In system building, we find that DNNs can benefit from a larger number of senones than the GMM-HMM, and that DNN likelihood evaluation is a sizeable runtime factor even in our wide-beam context of generating rich lattices: cutting the model size by 60% reduces runtime by one-third at a 5% relative WER loss.
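
The abstract states its results as relative changes in word error rate. As a minimal sketch of how such figures are usually computed, the Python snippet below uses the standard relative-reduction formula. The function name and the baseline value of 20.0 are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction of new_wer versus baseline_wer.

    A positive value means the new system is better; a negative value
    means it is worse (a relative WER loss).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical baseline WER in percent (NOT a figure from the paper).
baseline = 20.0

# A one-fourth relative reduction, as described for the field data.
dnn_wer = baseline * (1 - 0.25)

# A 5% relative WER loss, as described for the smaller DNN model.
small_dnn_wer = dnn_wer * 1.05

print(f"relative reduction vs. baseline: {relative_wer_reduction(baseline, dnn_wer):.2%}")
print(f"relative change, small vs. full DNN: {relative_wer_reduction(dnn_wer, small_dnn_wer):.2%}")
```

Note that relative changes compound rather than add: in this sketch a 5% relative loss applied on top of a 25% relative reduction leaves the smaller model about 21% better than the baseline, not 20%.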