Journal: 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication
Evaluating Rapid Unsupervised Speaker Adaptation Using Linear Interpolation of HMM-Sufficient Statistics

Abstract

Speaker adaptation techniques minimize the effect of speaker variability. In real-time applications, speaker adaptation must be carried out rapidly using a minimum amount of adaptation data. We propose to improve unsupervised speaker adaptation based on HMM-Sufficient Statistics using linear interpolation. This adaptation technique uses a single arbitrary utterance to provide data for adaptation by selecting the N-best speakers' Sufficient Statistics. Reducing the number of selected N-best speakers reduces adaptation time. However, recognition performance then degrades due to the insufficiency of data needed to robustly adapt the model. We introduce linear interpolation of the global HMM-Sufficient Statistics to offset the negative effect of reducing N-best. In our experiments, this reduced the adaptation time by 50%, from 10 sec to 5 sec, without degrading recognition performance. Furthermore, we compared our method with Vocal Tract Length Normalization (VTLN), Maximum A Posteriori (MAP) and Maximum Likelihood Linear Regression (MLLR). Moreover, we tested the performance of our approach in office, car, crowd and booth noise environments at SNRs of 10 dB, 15 dB, 20 dB and 25 dB.
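
The idea described in the abstract can be illustrated with a minimal Python sketch for a single one-dimensional Gaussian. It is not the authors' implementation: the abstract does not give the interpolation formula, so the weighted sum of raw statistics used here is an assumption, and the names `SuffStats`, `select_n_best`, `accumulate`, `interpolate` and `to_gaussian`, the per-speaker scores, and the weight value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SuffStats:
    """Sufficient statistics of one 1-D Gaussian (hypothetical container)."""
    count: float   # occupancy count
    first: float   # sum of observations
    second: float  # sum of squared observations


def select_n_best(scores, n):
    """Return the ids of the n speakers with the highest score.

    `scores` maps speaker id -> similarity score for the single adaptation
    utterance (how the score is computed is outside this sketch).
    """
    return sorted(scores, key=scores.get, reverse=True)[:n]


def accumulate(stats_list):
    """Sum the sufficient statistics of several speakers."""
    return SuffStats(
        sum(s.count for s in stats_list),
        sum(s.first for s in stats_list),
        sum(s.second for s in stats_list),
    )


def interpolate(selected, global_stats, weight):
    """Assumed form: weight * selected + (1 - weight) * global, per statistic."""
    w, g = weight, 1.0 - weight
    return SuffStats(
        w * selected.count + g * global_stats.count,
        w * selected.first + g * global_stats.first,
        w * selected.second + g * global_stats.second,
    )


def to_gaussian(stats):
    """Re-estimate mean and variance from the sufficient statistics."""
    mean = stats.first / stats.count
    variance = stats.second / stats.count - mean * mean
    return mean, variance


# Toy data: three speakers, statistics stored in advance.
speaker_stats = {
    "A": SuffStats(2.0, 4.0, 10.0),
    "B": SuffStats(2.0, 4.0, 8.0),
    "C": SuffStats(6.0, 22.0, 82.0),
}
global_stats = SuffStats(10.0, 30.0, 100.0)
scores = {"A": -10.0, "B": -12.0, "C": -30.0}

best = select_n_best(scores, 2)
selected = accumulate([speaker_stats[s] for s in best])
adapted_mean, adapted_var = to_gaussian(interpolate(selected, global_stats, 0.5))
```

In this sketch, a smaller N means fewer statistics to accumulate, which is the source of the speed-up, while the global statistics stand in for the data that the discarded speakers would have contributed.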