Journal: Computer Speech and Language

Relevance factor of maximum a posteriori adaptation for GMM-NAP-SVM in speaker and language recognition

Abstract

This paper studies the relevance factor in maximum a posteriori (MAP) adaptation of Gaussian mixture models (GMMs) for speaker and language recognition. Because the relevance factor determines how much the observed training data influence the model adaptation, and thus the resulting GMM, it is believed that more effective modeling can be achieved if the relevance factor adapts to the corresponding data. We therefore provide a mathematical derivation for the estimation of the relevance factor. The GMM supervector support vector machine (SVM) with nuisance attribute projection (NAP), abbreviated GMM-NAP-SVM, has been reported to be effective and reliable for speaker and language recognition. Being a discriminative classifier by nature, a GMM-NAP-SVM system is sensitive to the magnitude and direction of a supervector in the high-dimensional space. However, when characterizing a speech utterance with a GMM supervector estimated through MAP, we observe that the resulting supervector is undesirably affected by the varying duration of the utterance. We propose an adaptive relevance factor that adapts to the duration in order to mitigate the variability caused by utterance length. We give a systematic investigation of different types of MAP relevance factor on different application platforms. We show the efficacy of the data-dependent and the adaptive relevance factors on the National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) 2008 and language recognition evaluation (LRE) 2009 and 2011 tasks, respectively.
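To make the role of the relevance factor concrete, the following is a minimal sketch of classic mean-only MAP adaptation with a fixed relevance factor. It is not the data-dependent or adaptive estimator the paper derives, which the abstract does not spell out. The function names, the diagonal-covariance assumption, and the default value `relevance=16.0` are illustrative assumptions, not taken from the paper.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a diagonal GMM with a FIXED relevance factor.

    Hypothetical helper for illustration; returns the adapted means and the
    per-component soft counts.
    """
    n_comp, dim = len(means), len(means[0])
    counts = [0.0] * n_comp                        # zeroth-order statistics
    sums = [[0.0] * dim for _ in range(n_comp)]    # first-order statistics

    for x in frames:
        # Component posteriors for this frame, computed stably in the log domain.
        logp = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        peak = max(logp)
        post = [math.exp(lp - peak) for lp in logp]
        total = sum(post)
        for k in range(n_comp):
            gamma = post[k] / total
            counts[k] += gamma
            for d in range(dim):
                sums[k][d] += gamma * x[d]

    adapted = []
    for k in range(n_comp):
        # alpha -> 1 as the soft count grows, so more data means less prior.
        alpha = counts[k] / (counts[k] + relevance)
        row = []
        for d in range(dim):
            data_mean = sums[k][d] / counts[k] if counts[k] > 0.0 else means[k][d]
            row.append(alpha * data_mean + (1.0 - alpha) * means[k][d])
        adapted.append(row)
    return adapted, counts


# Same acoustic content, different durations: 16 frames versus 48 frames.
prior = dict(weights=[1.0], means=[[0.0]], variances=[[1.0]])
short_means, _ = map_adapt_means([[1.0]] * 16, **prior)
long_means, _ = map_adapt_means([[1.0]] * 48, **prior)
print(short_means[0][0], long_means[0][0])
```

With a fixed relevance factor, the 48-frame utterance pulls the adapted mean further from the prior than the 16-frame utterance, even though both contain identical frames. This is the duration dependence of the supervector that the abstract describes and that the proposed adaptive relevance factor aims to mitigate.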