PLDA FOR SPEAKER VERIFICATION WITH UTTERANCES OF ARBITRARY DURATION

Abstract

The duration of speech segments has traditionally been controlled in the NIST speaker recognition evaluations, so researchers working in this framework have been relieved of the responsibility of dealing with the duration variability that arises in practical applications. The fixed-dimensional i-vector representation of speech utterances is ideal for working under such controlled conditions, and ignoring the fact that i-vectors extracted from short utterances are less reliable than those extracted from long utterances leads to a very simple formulation of the speaker recognition problem. However, a more realistic approach seems to be needed to handle duration variability properly. In this paper, we show how to quantify the uncertainty associated with the i-vector extraction process and propagate it into a PLDA classifier. We evaluated this approach using test sets derived from the NIST 2010 core and extended core conditions by randomly truncating the utterances in the female telephone speech trials so that the durations of all enrollment and test utterances lay in the range 3-60 seconds, and we found that it led to substantial improvements in accuracy. Although the likelihood ratio computation for speaker verification is more computationally expensive than in the standard i-vector/PLDA classifier, it is still quite modest, as it reduces to computing the probability density functions of two full covariance Gaussians (irrespective of the number of utterances used to enroll a speaker).
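
The idea of propagating i-vector uncertainty into the likelihood ratio can be illustrated with a minimal sketch. It is not the formulation from the paper: it assumes a simplified two-covariance PLDA model (i-vector = mean + speaker factor + residual), a single enrollment utterance, and a per-utterance covariance that is simply added to the within-speaker covariance. All names here (`gaussian_logpdf`, `llr_with_uncertainty`, `B`, `W`, `S_enroll`, `S_test`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log density of a full-covariance Gaussian evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def llr_with_uncertainty(w_enroll, w_test, m, B, W, S_enroll, S_test):
    """Log-likelihood ratio (same speaker vs. different speakers).

    Assumed model (illustrative, not the paper's): w = m + y + e, with
    y ~ N(0, B) shared by utterances of one speaker and e ~ N(0, W + S),
    where S is the utterance's own i-vector posterior covariance
    (larger for shorter utterances).
    """
    d = m.shape[0]
    x = np.concatenate([w_enroll, w_test])
    mean = np.concatenate([m, m])
    T1 = B + W + S_enroll          # total covariance of the enrollment i-vector
    T2 = B + W + S_test            # total covariance of the test i-vector
    zero = np.zeros((d, d))
    # Same speaker: the shared factor y couples the two i-vectors through B.
    cov_same = np.block([[T1, B], [B, T2]])
    # Different speakers: the two i-vectors are independent.
    cov_diff = np.block([[T1, zero], [zero, T2]])
    return gaussian_logpdf(x, mean, cov_same) - gaussian_logpdf(x, mean, cov_diff)


if __name__ == "__main__":
    d = 2
    m = np.zeros(d)
    B = np.eye(d)                  # between-speaker covariance (assumed)
    W = 0.1 * np.eye(d)            # within-speaker covariance (assumed)
    w = np.array([1.0, 1.0])
    reliable = 0.01 * np.eye(d)    # small uncertainty, e.g. a long utterance
    unreliable = 100.0 * np.eye(d) # large uncertainty, e.g. a very short one
    print(llr_with_uncertainty(w, w, m, B, W, reliable, reliable))
    print(llr_with_uncertainty(w, w, m, B, W, reliable, unreliable))
```

In this sketch the score is the difference of two full-covariance Gaussian log densities, and a larger uncertainty on either utterance pulls the score toward zero, so an unreliable short utterance contributes less evidence in either direction. Handling several enrollment utterances, as the abstract describes, is outside what this sketch covers.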


Bibliographic details

Source: IEEE International Conference on Acoustics, Speech, and Signal Processing | 2013 | 5 pages
Authors: Patrick Kenny; Themos Stafylakis; Pierre Ouellet; Jahangir Alam; Pierre Dumouchel
Original format: PDF
Chinese Library Classification: TN912-53

