Journal: Computer Speech and Language

Building DNN acoustic models for large vocabulary speech recognition


Abstract

Understanding architectural choices for deep neural networks (DNNs) is crucial to improving state-of-the-art speech recognition systems. We investigate which aspects of DNN acoustic model design are most important for speech recognition system performance, focusing on feed-forward networks. We study the effects of parameters like model size (number of layers, total parameters), architecture (convolutional networks), and training details (loss function, regularization methods) on DNN classifier performance and speech recognizer word error rates. On the Switchboard benchmark corpus we compare standard DNNs to convolutional networks, and present the first experiments using locally-connected, untied neural networks for acoustic modeling. Using a much larger 2100-hour training corpus (combining Switchboard and Fisher) we examine the performance of very large DNN models, with up to ten times more parameters than those typically used in speech recognition systems. The results suggest that a relatively simple DNN architecture and optimization technique give strong performance, and we offer intuitions about architectural choices like network depth over breadth. Our findings extend previous work to help establish a set of best practices for building DNN hybrid speech recognition systems and constitute an important first step toward analyzing more complex recurrent, sequence-discriminative, and HMM-free architectures.
