Annual Conference of the International Speech Communication Association (INTERSPEECH 2011)

Addressing the Data-Imbalance Problem in Kernel-based Speaker Verification via Utterance Partitioning and Speaker Comparison



Abstract

GMM-SVM has become a promising approach to text-independent speaker verification. However, this approach suffers from an extremely severe imbalance between the number of speaker-class and the number of impostor-class utterances available for training the speaker-dependent SVMs. This data-imbalance problem can be addressed by (1) creating more speaker-class supervectors for SVM training through utterance partitioning with acoustic vector resampling (UP-AVR), or (2) avoiding SVM training altogether, so that speaker scores are formulated as an inner product discriminant function (IPDF) between the target-speaker supervector and the test supervector. This paper highlights the differences between these two approaches and compares the effect of different kernels, including the KL-divergence kernel, the GMM-UBM mean interval (GUMI) kernel, and the geometric-mean-comparison kernel, on their performance. Experiments on the NIST 2010 Speaker Recognition Evaluation suggest that GMM-SVM with UP-AVR is superior to speaker comparison, and that in speaker comparison the GUMI kernel is slightly better than the KL kernel.
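The two ingredients named in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: `up_avr` shuffles the frame order of an utterance and splits each shuffled copy into sub-utterances (each of which would then be mapped to its own supervector, multiplying the speaker-class training examples), and `kl_kernel_score` is a hedged stand-in for an IPDF-style inner-product score between mean supervectors, weighted by the UBM mixture weights and (diagonal) variances in the spirit of a KL-divergence kernel. All function names, shapes, and parameter choices here are assumptions for illustration.

```python
import numpy as np

def up_avr(frames, n_partitions=4, n_resamplings=2, seed=0):
    """UP-AVR sketch: utterance partitioning with acoustic vector resampling.

    frames: (T, D) array of acoustic vectors from one utterance.
    Returns a list of sub-utterances (the full utterance plus
    n_resamplings * n_partitions randomized partitions), each of which
    would be turned into a GMM supervector for SVM training.
    """
    rng = np.random.default_rng(seed)
    subs = [frames]  # keep the full utterance as one training example
    for _ in range(n_resamplings):
        order = rng.permutation(len(frames))      # resample frame order
        shuffled = frames[order]
        subs.extend(np.array_split(shuffled, n_partitions))
    return subs

def kl_kernel_score(sv_a, sv_b, ubm_weights, ubm_vars):
    """Inner-product score in the spirit of a KL-divergence kernel.

    sv_a, sv_b: (C, D) mean supervectors (C mixtures, D dims per mean),
    ubm_weights: (C,) mixture weights, ubm_vars: (C, D) diagonal variances.
    """
    w = ubm_weights[:, None]                      # broadcast over dims
    return float(np.sum(w * sv_a * sv_b / ubm_vars))
```

With, say, a 100-frame utterance and the defaults above, `up_avr` yields 1 + 2 x 4 = 9 sub-utterances, so each enrollment utterance contributes nine speaker-class supervectors instead of one.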


