Annual Conference of the International Speech Communication Association (INTERSPEECH 2011)

Addressing the Data-Imbalance Problem in Kernel-based Speaker Verification via Utterance Partitioning and Speaker Comparison



Abstract

GMM-SVM has become a promising approach to text-independent speaker verification. However, this approach suffers from an extremely severe imbalance between the number of speaker-class and the number of impostor-class utterances available for training the speaker-dependent SVMs. This data-imbalance problem can be addressed by (1) creating more speaker-class supervectors for SVM training through utterance partitioning with acoustic vector resampling (UP-AVR), or (2) avoiding SVM training altogether, so that speaker scores are formulated as an inner product discriminant function (IPDF) between the target-speaker supervector and the test supervector. This paper highlights the differences between these two approaches and compares the effect of different kernels, including the KL-divergence kernel, the GMM-UBM mean interval (GUMI) kernel, and the geometric-mean-comparison kernel, on their performance. Experiments on the NIST 2010 Speaker Recognition Evaluation suggest that GMM-SVM with UP-AVR is superior to speaker comparison, and that in speaker comparison the GUMI kernel is slightly better than the KL kernel.
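The two ingredients named in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: `up_avr` shuffles the frame order of an utterance and splits each shuffled copy into sub-utterances (each of which would then be mapped to its own supervector, multiplying the speaker-class training examples), and `kl_kernel_score` is a hedged stand-in for an IPDF-style inner-product score between mean supervectors, weighted by the UBM mixture weights and (diagonal) variances in the spirit of a KL-divergence kernel. All function names, shapes, and parameter choices here are assumptions for illustration.

```python
import numpy as np

def up_avr(frames, n_partitions=4, n_resamplings=2, seed=0):
    """UP-AVR sketch: utterance partitioning with acoustic vector resampling.

    frames: (T, D) array of acoustic vectors from one utterance.
    Returns a list of sub-utterances (the full utterance plus
    n_resamplings * n_partitions randomized partitions), each of which
    would be turned into a GMM supervector for SVM training.
    """
    rng = np.random.default_rng(seed)
    subs = [frames]  # keep the full utterance as one training example
    for _ in range(n_resamplings):
        order = rng.permutation(len(frames))      # resample frame order
        shuffled = frames[order]
        subs.extend(np.array_split(shuffled, n_partitions))
    return subs

def kl_kernel_score(sv_a, sv_b, ubm_weights, ubm_vars):
    """Inner-product score in the spirit of a KL-divergence kernel.

    sv_a, sv_b: (C, D) mean supervectors (C mixtures, D dims per mean),
    ubm_weights: (C,) mixture weights, ubm_vars: (C, D) diagonal variances.
    """
    w = ubm_weights[:, None]                      # broadcast over dims
    return float(np.sum(w * sv_a * sv_b / ubm_vars))
```

With, say, a 100-frame utterance and the defaults above, `up_avr` yields 1 + 2 x 4 = 9 sub-utterances, so each enrollment utterance contributes nine speaker-class supervectors instead of one.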


