Rapid discriminative acoustic model based on eigenspace mapping for fast speaker adaptation

Bowen Zhou; Hansen J.H.L.

Abstract

It is widely believed that strong correlations exist across an utterance as a consequence of time-invariant characteristics of speaker and acoustic environments. It is verified in this paper that the first primary eigendirections of the utterance covariance matrix are speaker dependent. Based on this observation, a novel family of fast speaker adaptation algorithms entitled Eigenspace Mapping (EigMap) is proposed. The proposed algorithms are applied to continuous density Hidden Markov Model (HMM) based speech recognition. The EigMap algorithm rapidly constructs discriminative acoustic models in the test speaker's eigenspace by preserving discriminative information learned from baseline models in the directions of the test speaker's eigenspace. Moreover, the adapted models are compressed by discarding model parameters that are assumed to contain no discrimination information. The core idea of EigMap can be extended in many ways, and a family of algorithms based on EigMap is described in this paper. Unsupervised adaptation experiments show that EigMap is effective in improving baseline models using very limited amounts of adaptation data with superior performance to conventional adaptation techniques such as MLLR and block diagonal MLLR. A relative improvement of 18.4% over a baseline recognizer is achieved using EigMap with only about 4.5 s of adaptation data. Furthermore, it is also demonstrated that EigMap is additive to MLLR by encompassing important speaker dependent discriminative information. A significant relative improvement of 24.6% over baseline is observed using 4.5 s of adaptation data by combining MLLR and EigMap techniques.
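The observation the abstract starts from, that the leading eigendirections of an utterance's covariance matrix carry speaker-dependent information, can be illustrated with a minimal sketch. This is not the authors' EigMap procedure, which the abstract does not specify in detail. It only shows how a primary eigendirection could be estimated from a sequence of feature frames. The function names `covariance` and `primary_eigendirection` and the toy 2-D frames are hypothetical, chosen for illustration.

```python
import math
import random


def covariance(frames):
    """Sample covariance matrix of a list of equal-length feature vectors."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    centered = [[f[j] - mean[j] for j in range(d)] for f in frames]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def primary_eigendirection(cov, iters=500, seed=0):
    """Power iteration: unit eigenvector for the largest eigenvalue of cov."""
    rng = random.Random(seed)
    d = len(cov)
    # Positive random start; assumed not orthogonal to the leading direction.
    v = [rng.random() + 0.1 for _ in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the matching eigenvalue (v has unit norm).
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return eigenvalue, v


# Toy "utterance" (made-up data): 2-D frames varying mostly along (1, 1).
frames = [[t + 0.1 * (-1) ** t, t - 0.1 * (-1) ** t] for t in range(20)]
eigenvalue, direction = primary_eigendirection(covariance(frames))
print(eigenvalue, direction)
```

Power iteration recovers only the single leading direction. If several leading eigendirections are needed, a full symmetric eigendecomposition (for example `numpy.linalg.eigh`) would be the more usual choice.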

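Bibliographic details

Source
IEEE Transactions on Speech and Audio Processing | 2005, No. 4 | pp. 554-564 | 11 pages
Authors
Bowen Zhou; Hansen J.H.L.
Original format: PDF
Language: English
CLC classification: Electroacoustic technology and speech signal processing
Keywords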
covariance matrices; eigenvalues and eigenfunctions; hidden Markov models; speech recognition; eigenspace mapping; fast speaker adaptation; hidden Markov model; rapid discriminative acoustic model; speaker dependent discriminative information; speech recognition;

Similar literature

1. Speaker Adaptation Based on PPCA of Acoustic Models in a Two-Way Array Representation [J]. Yongwon Jeong. IEICE Transactions on Information and Systems. 2014, No. 8
2. Unsupervised Speaker Adaptation Based on HMM Sufficient Statistics Using Multiple Acoustic Models Under Noisy Environment [J]. Randy Gomez, Akinobu Lee, Hiroshi Saruwatari. 電子情報通信学会技術研究報告. 音声. Speech. 2004, No. 542
3. Discriminative Acoustic Model Using Eigenspace Mapping for Rapid Speaker Adaptation [C]. IEEE International Conference on Acoustics, Speech, and Signal Processing. 2003
4. Speaker Characteristic-based Acoustic Model Adaptation Method for Speaker Recognition Systems [D]. Millington, Daniel S. 2011
5. Acoustic and perceptual correlates of faster-than-habitual speech produced by speakers with Parkinson's disease and Multiple Sclerosis [O]. Christina Kuo, Kris Tjaden, Joan E. Sussman
6. Eigenspace-Based Maximum A Posteriori Linear Regression for Rapid Speaker Adaptation [O]. Kuan-ting Chen, Hsin-min Wang. 2001
