IEEE International Conference on Acoustics, Speech and Signal Processing

Integrating Gaussian mixtures into deep neural networks: Softmax layer with hidden variables



Abstract

In the hybrid approach, neural network outputs directly serve as hidden Markov model (HMM) state posterior probability estimates. In contrast, in the tandem approach the neural network output is used as input features to improve classic Gaussian mixture model (GMM) based emission probability estimates. This paper shows that the GMM can be easily integrated into the deep neural network framework. By exploiting its equivalence with the log-linear mixture model (LMM), a GMM can be transformed into a large softmax layer followed by a summation pooling layer. Theoretical and experimental results indicate that a jointly trained and optimally chosen GMM with bottleneck tandem features cannot perform worse than a hybrid model. Thus, the question "hybrid vs. tandem" simplifies to optimizing the output layer of a neural network. Speech recognition experiments are carried out on a broadcast news and conversations task using up to 12 feed-forward hidden layers with sigmoid and rectified linear unit activation functions. The evaluation of the LMM layer shows recognition gains over the classic softmax output.
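To make the softmax-plus-pooling construction concrete, below is a minimal sketch of how a GMM state posterior reduces to one large softmax over (state, component) pairs followed by summation pooling, assuming a covariance matrix shared across all components (the condition under which the GMM becomes log-linear). NumPy and all names here (lmm_posteriors, num_states, num_components, W, b) are illustrative assumptions, not the paper's implementation.

import numpy as np

def lmm_posteriors(x, W, b, num_states, num_components):
    # One logit per (state, component) pair: a single large softmax layer.
    logits = x @ W + b
    logits = logits - logits.max()      # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum()         # softmax over all (s, c) pairs
    # Summation pooling over the hidden component variable c yields p(s | x).
    return probs.reshape(num_states, num_components).sum(axis=1)

# With a shared covariance Sigma, a GMM with state priors p(s), mixture
# weights w_sc and means mu_sc reduces exactly to this form:
#   W[:, (s,c)] = inv(Sigma) @ mu_sc
#   b[(s,c)]    = log p(s) + log w_sc - 0.5 * mu_sc @ inv(Sigma) @ mu_sc
# (the quadratic term -0.5 * x @ inv(Sigma) @ x cancels in the softmax).

# Illustrative usage: 3 HMM states, 2 mixture components, 4-dim features.
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W = rng.normal(size=(4, 6))
b = rng.normal(size=6)
print(lmm_posteriors(x, W, b, 3, 2))    # sums to 1 over the 3 states

Because the pooled softmax is just another differentiable output layer, such a GMM/LMM can be trained jointly with the preceding hidden layers by backpropagation, which is what allows the paper to compare it directly against a classic softmax output.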
