Speaker Adaptation of Hybrid NN/HMM Model for Speech Recognition Based on Singular Value Decomposition

Xue Shaofei; Jiang Hui; Dai Lirong; Liu Qingfeng


Abstract

Recently, several speaker adaptation methods have been proposed for deep neural networks (DNNs) in many large vocabulary continuous speech recognition (LVCSR) tasks. However, only a few methods directly tune the connection weights of a trained DNN to optimize system performance, because such tuning is very prone to over-fitting, especially when some class labels are missing from the adaptation data. In this paper, we propose a new speaker adaptation method for the hybrid NN/HMM speech recognition model based on singular value decomposition (SVD). We apply SVD to the weight matrices of a trained DNN and then tune the resulting rectangular diagonal matrices with the adaptation data. Because only the singular values are modified, the weight matrices change only slightly, which alleviates the over-fitting problem. We evaluate the proposed adaptation method on two standard speech recognition tasks, namely TIMIT phone recognition and large vocabulary speech recognition on the Switchboard task. Experimental results show that the method can effectively adapt large DNN models using only a small amount of adaptation data. For example, recognition results on the Switchboard task show that the proposed SVD-based adaptation method may achieve a relative error reduction of up to 3-6% using only a few dozen adaptation utterances per speaker.
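
The core idea in the abstract (factor a trained weight matrix with SVD, then update only the singular values on adaptation data) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the single linear layer, the mean squared-error loss, the learning rate, and the use of the reduced SVD (a vector of singular values in place of the full rectangular diagonal matrix) are all assumptions made for illustration.

```python
import numpy as np


def svd_factor(weight):
    """Factor a trained weight matrix as U @ diag(s) @ Vt (reduced SVD)."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    return u, s, vt


def adapt_singular_values(u, s, vt, inputs, targets, lr=0.1, steps=200):
    """Gradient descent on the singular values only; U and Vt stay fixed.

    Assumption: a squared-error loss on one linear layer stands in for
    the real training criterion of a DNN acoustic model.
    """
    s = s.copy()
    n = inputs.shape[0]
    for _ in range(steps):
        weight = (u * s) @ vt                # rebuild W from current singular values
        err = inputs @ weight.T - targets    # residuals, shape (n, out_dim)
        grad_w = err.T @ inputs / n          # dL/dW, shape (out_dim, in_dim)
        # dL/ds_i = u_i^T (dL/dW) v_i, since W = sum_i s_i * u_i v_i^T
        grad_s = np.einsum("ji,jk,ik->i", u, grad_w, vt)
        s -= lr * grad_s
    return s


# Toy usage: a 4x6 "speaker-independent" layer adapted with 20 synthetic frames.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 6))
u, s, vt = svd_factor(w)
x = rng.normal(size=(20, 6))
t = x @ ((u * (1.2 * s)) @ vt).T             # synthetic "speaker-dependent" targets
s_adapted = adapt_singular_values(u, s, vt, x, t)
w_adapted = (u * s_adapted) @ vt
```

In this toy setting the adapted layer has only min(4, 6) = 4 free parameters instead of 24, which is the kind of reduction that makes over-fitting on a small adaptation set less likely.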


Bibliographic record

Source
Journal of signal processing systems for signal, image, and video technology | 2016, Issue 2 | pp. 175-185 | 11 pages
Authors
Xue Shaofei; Jiang Hui; Dai Lirong; Liu Qingfeng
Author affiliations

Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei 230026, Peoples R China;

York Univ, Dept Elect Engn & Comp Sci, Toronto, ON M3J 2R7, Canada;

Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei 230026, Peoples R China;

Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei 230026, Peoples R China;

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords
Deep neural network (DNN); Hybrid DNN/HMM; Speaker adaptation; Singular value decomposition (SVD);


Similar literature

1. Hybrid NN/HMM acoustic modeling techniques for distributed speech recognition [J]. Jan Stadermann, Gerhard Rigoll. Speech Communication. 2006, Issue 8
2. Hybrid HMM-NN modeling of stationary-transitional units for continuous speech recognition [J]. Albesano D., Mana F., Gemello R. Information Sciences: An International Journal. 2000, Issue 1a2
3. Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis [J]. John Dines, Hui Liang, Lakshmi Saheer, Computer speech and language. 2013, Issue 2
4. Speaker adaptation of hybrid NN/HMM model for speech recognition based on singular value decomposition [C]. Xue Shaofei, Jiang Hui, Dai Lirong. International Symposium on Chinese Spoken Language Processing. 2014
5. Model selection based speaker adaptation and its application to nonnative speech recognition [D]. He, Xiaodong. 2003
6. Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition [O]. Myungjong Kim, Younggwan Kim, Joohong Yoo, -1
7. Model Adaptation based on HMM decomposition for Reverberant Speech Recognition [O]. Tetsuya Takiguchi, Satoshi Nakamura, Qiang Huo, 1997
