Published in: International Journal of Speech Technology

Sub-vector based biometric speaker verification using MLLR super-vector


Abstract

In this paper, we propose a sub-vector based speaker characterization method for biometric speaker verification, in which speakers are represented by m-vectors obtained by uniformly segmenting their maximum likelihood linear regression (MLLR) super-vectors. The MLLR transformation is estimated with respect to a universal background model (UBM) without using any speech/phonetic information. We introduce two strategies for segmenting the MLLR super-vector: a disjoint technique and an overlapped-window technique. During the test phase, the m-vectors of the test utterance are scored against the claimant speaker. Before scoring, the m-vectors are post-processed to compensate for session variability. In addition, we propose a clustering algorithm for multi-class MLLR transformation, in which the Gaussian components of the UBM are clustered into groups using expectation maximization (EM) and maximum likelihood (ML). In this case, an MLLR transformation is estimated for each class from the sufficient statistics accumulated over the Gaussian components belonging to that class, and the resulting transformations are then used in the m-vector system. The proposed method requires only a single alignment of the data with respect to the UBM to estimate multiple MLLR transformations. We first show that the proposed multi-class m-vector system gives promising speaker verification performance compared with the conventional i-vector based speaker verification system. Second, the proposed EM-based clustering technique is robust to random initialization, in contrast to the conventional K-means algorithm, and yields system performance better than or equal to the best obtained with K-means. Finally, we show that fusing the m-vector with the i-vector further improves speaker verification performance in both the score domain and the feature domain. Experimental results are reported on various tasks of the NIST 2008 speaker recognition evaluation (SRE) core condition.
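
The two segmentation strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `split_supervector` and the parameters `width` and `shift` are hypothetical, the abstract gives no window sizes, and dropping a trailing remainder shorter than one window is an assumption made here for simplicity.

```python
def split_supervector(supervector, width, shift=None):
    """Cut a flat super-vector into equal-width sub-vectors.

    shift == width gives disjoint segments; shift < width gives
    overlapped windows. Both names are illustrative, not from the paper.
    """
    values = list(supervector)
    if shift is None:
        shift = width  # default: disjoint segmentation
    if width <= 0 or shift <= 0:
        raise ValueError("width and shift must be positive")
    if len(values) < width:
        raise ValueError("super-vector is shorter than one window")
    # Assumption: a trailing remainder shorter than `width` is dropped.
    starts = range(0, len(values) - width + 1, shift)
    return [values[s:s + width] for s in starts]


# Toy 12-dimensional "super-vector" standing in for stacked MLLR parameters.
sv = list(range(12))
disjoint = split_supervector(sv, width=4)             # windows start at 0, 4, 8
overlapped = split_supervector(sv, width=4, shift=2)  # windows start at 0, 2, 4, 6, 8
```

With overlapped windows, neighbouring sub-vectors share `width - shift` elements, so the same super-vector yields more sub-vectors than disjoint segmentation does.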