IEEE/ACM Transactions on Audio, Speech, and Language Processing

Factorized Hidden Layer Adaptation for Deep Neural Network Based Acoustic Modeling

Abstract

In this paper, we propose the factorized hidden layer (FHL) approach to adapt deep neural network (DNN) acoustic models for automatic speech recognition (ASR). FHL models speaker-dependent (SD) hidden layers by representing an SD affine transformation as a linear combination of bases. The combination weights are low-dimensional speaker parameters that can be initialized from speaker representations such as i-vectors and then reliably refined in an unsupervised fashion during adaptation. Our method therefore provides an efficient way to perform both adaptive training and (test-time) adaptation. Experimental results show that FHL adaptation improves ASR performance significantly compared to standard DNN models, as well as other state-of-the-art DNN adaptation approaches such as training on speaker-normalized CMLLR features, speaker-aware training using i-vectors, and learning hidden unit contributions (LHUC). On Aurora 4, FHL achieves 3.8% and 2.3% absolute improvements over standard DNNs trained on LDA + STC and CMLLR features, respectively, and a 1.7% absolute improvement over a system that combines i-vector adaptive training with LHUC adaptation. On the AMI dataset, FHL achieves 1.4% and 1.9% absolute improvements over the sequence-trained CMLLR baseline systems for the IHM and SDM tasks, respectively.
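To make the parameterization concrete, below is a minimal NumPy sketch of an FHL-style speaker-dependent layer using rank-1 bases, i.e. W_s = W + U diag(d_s) V^T, where d_s is the low-dimensional speaker weight vector described in the abstract. The class name, layer sizes, number of bases, ReLU nonlinearity, and initialization here are illustrative assumptions, not the paper's settings.

```python
import numpy as np

class FactorizedHiddenLayer:
    """Sketch of an FHL-style speaker-dependent affine layer.

    The SD weight matrix is modeled as W_s = W + U @ diag(d_s) @ V.T:
    the shared weights W plus a linear combination of rank-1 bases
    (outer products of the columns of U and V), weighted by a
    low-dimensional speaker vector d_s.
    """

    def __init__(self, in_dim, out_dim, n_bases, rng=None):
        rng = rng or np.random.default_rng(0)
        self.W = 0.01 * rng.standard_normal((out_dim, in_dim))   # shared weights
        self.b = np.zeros(out_dim)                                # shared bias
        self.U = 0.01 * rng.standard_normal((out_dim, n_bases))  # left basis factors
        self.V = 0.01 * rng.standard_normal((in_dim, n_bases))   # right basis factors

    def forward(self, x, d_s):
        """x: (in_dim,) input activation; d_s: (n_bases,) speaker weights."""
        W_s = self.W + self.U @ np.diag(d_s) @ self.V.T  # speaker-dependent weights
        return np.maximum(W_s @ x + self.b, 0.0)         # ReLU(W_s x + b)

# In adaptive training, W, b, U, V are shared across speakers, while each
# speaker's d_s is initialized from a representation such as an i-vector
# and then refined by back-propagation during unsupervised adaptation.
layer = FactorizedHiddenLayer(in_dim=512, out_dim=512, n_bases=100)
d_s = np.zeros(100)  # hypothetical speaker vector before adaptation
h = layer.forward(np.ones(512), d_s)
```

Note that with d_s = 0 the layer reduces to the shared, speaker-independent transformation, which is why refining only d_s per speaker keeps the number of adapted parameters small and the refinement reliable even on limited adaptation data.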
