Investigation of Speaker-Clustered UBMs based on Vocal Tract Lengths and MLLR matrices for Speaker Verification

Abstract

It is common to use a single, large, speaker-independent Gaussian Mixture Model based Universal Background Model (GMM-UBM) as the alternative hypothesis in speaker verification tasks. The speaker models are themselves derived from the UBM using the Maximum a Posteriori (MAP) adaptation technique. During verification, the log-likelihood ratio between the target model and the GMM-UBM is calculated to accept or reject the claimant. Using a single UBM for different population groups may not be appropriate, especially when the impostors are close to the target speaker. In this paper, we investigate the use of a Speaker Cluster-wise UBM (SC-UBM) for a group of target speakers, based on two different similarity measures. In the first approach, speakers are grouped into clusters according to their Vocal Tract Lengths (VTLs). Speakers sharing the same VTL parameter have similar vocal-tract geometry, which constitutes a speaker-dependent characteristic. In the second approach, we use the Maximum Likelihood Linear Regression (MLLR) matrices of the target speakers to create MLLR super-vectors and use them to cluster speakers into groups. The SC-UBMs are derived from the GMM-UBM by MLLR adaptation, using data from the corresponding group of target speakers. Finally, speaker-dependent models are adapted from their respective SC-UBMs using MAP. In the proposed method, the log-likelihood ratio is calculated between the target model and its corresponding SC-UBM. We compare the performance of this method with the single-UBM method for varying numbers of clusters. The experiments are performed on the NIST 2004 SRE core condition, and we show that the proposed method, with a slight increase in the number of UBMs, always outperforms the conventional single GMM-UBM system.
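
The scoring rule described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy one-dimensional models, the warp-factor values, and the zero decision threshold are all assumptions made for illustration. The sketch also skips the MLLR and MAP adaptation steps and simply takes the cluster UBMs and target model as given. It shows only two ideas: grouping speakers that share the same VTL parameter, and scoring a claim against the claimed speaker's cluster UBM rather than a single global UBM.

```python
import math
from collections import defaultdict


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_avg_loglik(frames, gmm):
    """Average per-frame log-likelihood; gmm is a list of (weight, mean, var)."""
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        top = max(comps)  # log-sum-exp for numerical stability
        total += top + math.log(sum(math.exp(c - top) for c in comps))
    return total / len(frames)


def cluster_by_vtl(warp_factors):
    """Group speaker ids that share the same (hypothetical) VTL warp factor."""
    clusters = defaultdict(list)
    for speaker, alpha in warp_factors.items():
        clusters[alpha].append(speaker)
    return dict(clusters)


def sc_ubm_score(frames, target_gmm, speaker, speaker_to_cluster, sc_ubms):
    """Log-likelihood ratio between the target model and its cluster's UBM."""
    ubm = sc_ubms[speaker_to_cluster[speaker]]
    return gmm_avg_loglik(frames, target_gmm) - gmm_avg_loglik(frames, ubm)


# Toy data (all values invented for illustration).
warps = {"spk1": 0.94, "spk2": 1.00, "spk3": 0.94}
clusters = cluster_by_vtl(warps)
speaker_to_cluster = {s: a for a, members in clusters.items() for s in members}

# One single-component "UBM" per cluster, keyed by warp factor.
sc_ubms = {0.94: [(1.0, [0.0], [1.0])], 1.00: [(1.0, [2.0], [1.0])]}
target = [(1.0, [0.5], [1.0])]  # stand-in for a MAP-adapted model of spk1
frames = [[0.5], [0.6], [0.4]]  # test-utterance features

score = sc_ubm_score(frames, target, "spk1", speaker_to_cluster, sc_ubms)
accept = score > 0.0  # threshold chosen arbitrarily for the sketch
```

In the MLLR-based variant, `cluster_by_vtl` would be replaced by a clustering of per-speaker super-vectors; the scoring function would stay the same, since it depends only on the speaker-to-cluster mapping.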

Bibliographic details

Source: Odyssey 2010: The Speaker and Language Recognition Workshop | 2010 | pp. 71-78 | 8 pages
Conference location: Brno (CS)
Authors: A. K. Sarkar; S. Umesh
Author affiliation (both authors): Department of Electrical Engineering, Indian Institute of Technology Madras, India
Original format: PDF
Language: English
Chinese Library Classification: Speech signal processing

Similar literature

1. Multiple background models for speaker verification using the concept of vocal tract length and MLLR super-vector [J]. A. K. Sarkar, S. Umesh. International Journal of Speech Technology. 2012, No. 3.
2. Evaluation of the Vocal Tract Length Normalization Based Classifiers for Speaker Verification [J]. Walid Hussein, Sarah Akram Essmat, Nestor Yoma. International Journal of Recent Contributions from Engineering, Science & IT. 2016, No. 4.
3. Speaker independent speech recognition using speaker clustering based on vocal tract length [J]. Yoichiro Yahata, Kouichi Yamaguchi. 電子情報通信学会技術研究報告. 音声. Speech. 2000, No. 595.
4. Vocal Tract Length Normalization factor based speaker-cluster UBM for speaker verification [C]. A. K. Sarkar, S. P. Rath, S. Umesh. Communications (NCC), 2010. 2010.
5. Efficient methods for rapid UBM training (RUT) for robust speaker verification [D]. Aravind Chandrasekaran. 2008.
6. The interaction of glottal-pulse rate and vocal-tract length in judgements of speaker size, sex and age [O]. David R. R. Smith, Roy D. Patterson.
7. Multi-Class UBM-Based MLLR m-Vector System for Speaker Verification [O]. Claude Barras, Achintya Sarkar. 2013.
8. New Kernel for SVM MLLR Based Speaker Recognition [R]. Z. N. Karam, W. M. Campbell. 2016.
