Relevance factor of maximum a posteriori adaptation for GMM-NAP-SVM in speaker and language recognition

Chang Huai You; Haizhou Li; Kong Aik Lee


Abstract

This paper studies the relevance factor in maximum a posteriori (MAP) adaptation of Gaussian mixture models (GMMs) for speaker and language recognition. Because the relevance factor determines how much the observed training data influence the model adaptation, and thus the resulting GMM, it is believed that more effective modeling can be achieved if the relevance factor adapts to the corresponding data. We therefore provide a mathematical derivation for the estimation of the relevance factor. The GMM supervector support vector machine (SVM) with nuisance attribute projection (NAP), denoted GMM-NAP-SVM, has been reported to be effective and reliable for speaker and language recognition. Being a discriminative classifier by nature, a GMM-NAP-SVM system is sensitive to the magnitude and direction of a supervector in the high-dimensional space. However, when characterizing a speech utterance with a GMM supervector estimated through MAP, we observe that the resulting supervector is undesirably affected by the varying duration of the utterance. We propose an adaptive relevance factor that adapts to the duration, to mitigate the variability caused by utterance length. We give a systematic investigation of different types of MAP relevance factor on different application platforms. We show the efficacy of the data-dependent as well as adaptive relevance factors on the National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) 2008 and language recognition evaluation (LRE) 2009 and 2011 tasks, respectively.
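
The abstract does not reproduce the paper's derivation, so the following is only a minimal sketch of the classical mean-only MAP update with a fixed relevance factor r, in which each component mean is interpolated as alpha_i * (data mean) + (1 - alpha_i) * (prior mean) with alpha_i = n_i / (n_i + r) and n_i the soft frame count. It is not the adaptive relevance factor the authors propose. The function names, the 1-D two-component toy background model, and the default r = 16 are assumptions made for illustration. The sketch shows the duration effect described above: repeating the same frames raises n_i, so the adapted means (the supervector) move further from the prior.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a 1-D GMM with a fixed relevance factor.

    Hypothetical helper for illustration; `relevance` is the factor r.
    Returns the adapted component means (a tiny "supervector").
    """
    k = len(weights)
    counts = [0.0] * k  # n_i: soft count of frames assigned to component i
    first = [0.0] * k   # sum over frames of posterior * x for component i
    for x in frames:
        likes = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        if total == 0.0:
            continue  # frame is numerically unexplained by every component
        for i in range(k):
            gamma = likes[i] / total  # posterior of component i for this frame
            counts[i] += gamma
            first[i] += gamma * x

    adapted = []
    for i in range(k):
        if counts[i] == 0.0:
            adapted.append(means[i])  # no data: keep the prior mean
            continue
        alpha = counts[i] / (counts[i] + relevance)  # data-vs-prior weight
        data_mean = first[i] / counts[i]
        adapted.append(alpha * data_mean + (1.0 - alpha) * means[i])
    return adapted


def shift_norm(adapted, prior):
    """Euclidean distance between adapted and prior mean vectors."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(adapted, prior)))


# Toy background model (assumed values, not from the paper).
ubm_weights = [0.5, 0.5]
ubm_means = [-2.0, 2.0]
ubm_vars = [1.0, 1.0]

short_utt = [2.5, 3.0, 3.5]  # few frames
long_utt = short_utt * 20    # same content, 20 times the duration

short_means = map_adapt_means(short_utt, ubm_weights, ubm_means, ubm_vars)
long_means = map_adapt_means(long_utt, ubm_weights, ubm_means, ubm_vars)

print("short-utterance shift:", shift_norm(short_means, ubm_means))
print("long-utterance shift: ", shift_norm(long_means, ubm_means))
```

With a fixed r, the two utterances carry the same content yet yield supervectors of different magnitude, which is the kind of duration sensitivity a relevance factor that depends on utterance length is meant to reduce.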


Bibliographic details

Source
Computer Speech and Language, 2015, Issue 1, pp. 116-134 (19 pages)
Authors
Chang Huai You; Haizhou Li; Kong Aik Lee
Author affiliations

Institute for Infocomm Research, A*STAR, Singapore (all three authors)

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
Maximum a posteriori; Supervector; Gaussian mixture model; Support vector machine


Similar literature


1. Hierarchical Bayesian combination of plug-in maximum a posteriori decoders in deep neural networks-based speech recognition and speaker adaptation [J]. Huang Zhen, Siniscalchi Sabato Marco, Lee Chin-Hui. Pattern Recognition Letters, 2017, 15 October issue.

2. Speaker adaptation in the maximum a posteriori framework based on the probabilistic 2-mode analysis of training models [J]. Yongwon Jeong. EURASIP Journal on Audio, Speech, and Music Processing, 2013, Issue 1.

3. Maximum a Posteriori Adaptation of the Centroid Model for Speaker Verification [J]. Hautamäki V., Kinnunen T., Kärkkäinen I. IEEE Signal Processing Letters, 2008, Issue 1.

4. Effect of Relevance Factor of Maximum a posteriori Adaptation for GMM-SVM in Speaker and Language Recognition [C]. Chang Huai You, Haizhou Li, Bin Ma. Annual Conference of the International Speech Communication Association, 2012.

5. Speaker Characteristic-based Acoustic Model Adaptation Method for Speaker Recognition Systems [D]. Millington, Daniel S. 2011.

6. Translation adaptation and validation of two versions of the Chronic Liver Disease Questionnaire in Malaysian patients for speakers of both English and Malay languages: a cross-sectional study [O]. Shasha Khairullah, Sanjiv Mahadeva. 2017.

7. Maximum likelihood and maximum a posteriori adaptation for distributed speaker recognition systems [O]. Sit CH, Mak MW, Kung SY. 2004.

8. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment [R]. Hansen, J. H. 2015.
