Unsupervised Speaker Adaptation Based on HMM Sufficient Statistics Using Multiple Acoustic Models Under Noisy Environment

Randy GOMEZ; Akinobu LEE; Hiroshi SARUWATARI; Kiyohiro SHIKANO

Abstract

Speaker adaptation is necessary in speech recognition to achieve high accuracy across a wide variety of speakers. At the same time, a class-dependent (CD) acoustic model for a specific gender/age class can yield better accuracy than a single speaker-independent (SI) model. In this research, we extend unsupervised speaker adaptation based on HMM Sufficient Statistics (HMM-SS) to multiple databases and multiple initial models, given a wide variety of speech databases. In contrast to the conventional approach, which uses only a single SI model as the base model, the proposed method uses multiple CD models to raise the performance of the initial model before adaptation. During speaker selection, the speaker's class is estimated from the N-best neighbor speakers chosen with Gaussian Mixture Models (GMMs), and the corresponding CD model is adopted as the base model. Unsupervised speaker adaptation is then performed by constructing an HMM from the HMM-SS of the selected speakers. Experiments were carried out on two databases, adult and senior speakers from JNAS, and testing was performed under noisy conditions (office, crowd, booth, and car noise) at 20 dB SNR. Recognition results show that the proposed multiple-model method outperforms the conventional approach. Moreover, a comparison with Maximum Likelihood Linear Regression (MLLR) adaptation using 10 supervised utterances confirms that our method performs better with only a single input utterance.
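The pipeline described in the abstract (GMM-based N-best speaker selection, class estimation, then rebuilding model parameters from pooled sufficient statistics) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the dictionary layout, the majority-vote rule for the class, and the reduction of the HMM to a single diagonal Gaussian are all assumptions made for illustration; the sketch also assumes each speaker's statistics were precomputed against the chosen base model.

```python
import math
from collections import Counter


def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples (assumed layout).
    """
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in gmm:
            ll = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                ll += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(ll)
        # log-sum-exp over mixture components for numerical stability
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)


def select_speakers(frames, speakers, n_best):
    """Return the n_best training speakers whose GMMs score the input highest."""
    ranked = sorted(
        speakers,
        key=lambda s: gmm_log_likelihood(frames, s["gmm"]),
        reverse=True,
    )
    return ranked[:n_best]


def estimate_class(selected):
    """Assumed rule: majority vote over the class labels of the selected speakers."""
    return Counter(s["cls"] for s in selected).most_common(1)[0][0]


def pool_statistics(selected):
    """Rebuild one Gaussian's mean and variance from pooled sufficient statistics.

    Each speaker carries an occupancy count, a first-order sum, and a
    second-order sum per dimension (hypothetical keys: occ, sum_x, sum_xx).
    """
    dim = len(selected[0]["stats"]["sum_x"])
    occ = sum(s["stats"]["occ"] for s in selected)
    mean = [sum(s["stats"]["sum_x"][d] for s in selected) / occ for d in range(dim)]
    var = [
        sum(s["stats"]["sum_xx"][d] for s in selected) / occ - mean[d] ** 2
        for d in range(dim)
    ]
    return mean, var
```

In a full system the pooling step would run over every Gaussian of every HMM state, and the estimated class would decide which CD model's statistics are pooled. Because only stored statistics are combined, no re-alignment of training data is needed at adaptation time, which is consistent with the abstract's claim that a single utterance suffices.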

Bibliographic record

Source
電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication, 2004, No. 539, 6 pages
Authors
Randy GOMEZ; Akinobu LEE; Hiroshi SARUWATARI; Kiyohiro SHIKANO
Original format: PDF
Language: English
Chinese Library Classification: Communication
Keywords
Unsupervised Adaptation; Noise Robustness; HMM Sufficient Statistics

Similar literature

1. Unsupervised Speaker Adaptation Based on HMM Sufficient Statistics Using Multiple Acoustic Models Under Noisy Environment [J]. Randy Gomez, Akinobu Lee, Hiroshi Saruwatari. 電子情報通信学会技術研究報告. 音声. Speech, 2004, No. 542
2. Unsupervised Speaker Adaptation Based on HMM Sufficient Statistics Using Multiple Acoustic Models Under Noisy Environment [J]. Randy Gomez, Akinobu Lee, Hiroshi Saruwatari. 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication, 2004, No. 539
3. Unsupervised Speaker Adaptation Based on HMM Sufficient Statistics in Various Noisy Environments [C]. Shingo Yamade, Akinobu Lee, Hiroshi Saruwatari. European Conference on Speech Communication and Technology, 2003
4. Speaker Characteristic-based Acoustic Model Adaptation Method for Speaker Recognition Systems [D]. Millington, Daniel S. 2011
5. Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition [O]. Myungjong Kim, Younggwan Kim, Joohong Yoo (year not listed)
6. Rapid Unsupervised Speaker Adaptation Based on Multi-Template HMM Sufficient Statistics in Noisy Environments [O]. Randy Gomez, Akinobu Lee, Hiroshi Saruwatari. 2005
