Evaluating Rapid Unsupervised Speaker Adaptation Using Linear Interpolation of HMM-Sufficient Statistics

Randy GOMEZ; Tomoki TODA; Hiroshi SARUWATARI; Kiyohiro SHIKANO

Abstract

Speaker adaptation techniques minimize the effect of speaker variability. In real-time applications, speaker adaptation must be carried out rapidly using a minimal amount of adaptation data. We propose to improve unsupervised speaker adaptation based on HMM-Sufficient Statistics by using linear interpolation. This adaptation technique uses a single arbitrary utterance to provide data for adaptation by selecting the Sufficient Statistics of the N-best speakers. Reducing the number of selected N-best speakers reduces adaptation time. However, recognition performance then degrades because there is insufficient data to adapt the model robustly. We introduce linear interpolation with the global HMM-Sufficient Statistics to offset the negative effect of reducing N-best. In our experiments, this achieved a 50% reduction in adaptation time, from 10 sec to 5 sec, without degrading recognition performance. Furthermore, we compared our method with Vocal Tract Length Normalization (VTLN), Maximum A Posteriori (MAP) and Maximum Likelihood Linear Regression (MLLR). Moreover, we tested the performance of our approach in office, car, crowd and booth noise environments at SNRs of 10 dB, 15 dB, 20 dB and 25 dB.
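The abstract does not give the interpolation formula, so the following is only a minimal sketch of the general idea, under stated assumptions: each speaker is represented by the sufficient statistics of a single one-dimensional Gaussian, the N-best speakers are chosen from hypothetical per-speaker scores, and the pooled statistics are blended with the global statistics using a hypothetical fixed `weight`. The names `SuffStats`, `select_n_best`, `pool`, `interpolate` and `estimate` are illustrative and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SuffStats:
    """Sufficient statistics of one 1-D Gaussian (hypothetical layout)."""
    count: float   # zeroth order: total occupancy
    first: float   # first order: sum of observations
    second: float  # second order: sum of squared observations


def select_n_best(scores: Dict[str, float], n: int) -> List[str]:
    """Return the n speaker IDs with the highest score (assumed: higher is closer)."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


def pool(stats: List[SuffStats]) -> SuffStats:
    """Sum the statistics of the selected speakers."""
    return SuffStats(
        sum(s.count for s in stats),
        sum(s.first for s in stats),
        sum(s.second for s in stats),
    )


def interpolate(selected: SuffStats, global_: SuffStats, weight: float) -> SuffStats:
    """Linearly blend selected and global statistics; weight is an assumed parameter."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    rest = 1.0 - weight
    return SuffStats(
        weight * selected.count + rest * global_.count,
        weight * selected.first + rest * global_.first,
        weight * selected.second + rest * global_.second,
    )


def estimate(stats: SuffStats) -> Tuple[float, float]:
    """Re-estimate the Gaussian mean and variance from the statistics."""
    mean = stats.first / stats.count
    variance = stats.second / stats.count - mean ** 2
    return mean, variance


# Toy data: three speakers, of which the two best-scoring ones are kept.
speaker_stats = {
    "spk_a": SuffStats(2.0, 4.0, 10.0),
    "spk_b": SuffStats(2.0, 8.0, 34.0),
    "spk_c": SuffStats(2.0, -6.0, 20.0),
}
scores = {"spk_a": -10.0, "spk_b": -12.0, "spk_c": -40.0}
global_stats = SuffStats(4.0, 0.0, 4.0)

chosen = select_n_best(scores, 2)
selected = pool([speaker_stats[s] for s in chosen])
mean, variance = estimate(interpolate(selected, global_stats, 0.5))
```

With fewer selected speakers the pooled statistics rest on less data, and in this sketch the global term is what keeps the re-estimated parameters stable. In practice the two sets of statistics can differ greatly in scale, so a real system would have to choose the weight, and possibly a normalization, with care.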

Bibliographic details

Source
《電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication》 | 2005, No. 493 | pp. 13-18 | 6 pages
Authors
Randy GOMEZ; Tomoki TODA; Hiroshi SARUWATARI; Kiyohiro SHIKANO;
Author affiliations

8916-5 Takayama-cho, Ikoma-shi, Nara;

Graduate School of Information Science, Nara Institute of Science and Technology;
Indexing information
Original format: PDF
Language of text: English
Chinese Library Classification: Communications;
Keywords
Rapid Unsupervised Speaker Adaptation; Noise Robustness; HMM Sufficient Statistics; Linear Interpolation;
