Journal: EURASIP Journal on Audio, Speech, and Music Processing

An improved i-vector extraction algorithm for speaker verification

Abstract

Over recent years, the i-vector-based framework has been shown to provide state-of-the-art performance in speaker verification. Each utterance is projected onto a total factor space and is represented by a low-dimensional feature vector. Channel compensation techniques are carried out in this low-dimensional feature space. Most of these compensation techniques take sets of extracted i-vectors as input. By constructing between-class and within-class covariance matrices, these techniques attempt to minimize the within-class variance, which is mainly caused by channel effects, and to maximize the variance between speakers. In real-world applications, enrollment and test data from each user (or speaker) are always scarce. Although it is widely thought that session variability is mostly caused by channel effects, phonetic variability, as another factor that causes session variability, still needs to be considered. In this paper we propose a new algorithm for extracting i-vectors from the total factor matrix, which we term component reduction analysis (CRA). This new algorithm contributes to better modelling of session variability in the total factor space. We report results on the male English trials of the core condition of the NIST 2008 Speaker Recognition Evaluation (SRE) dataset. As measured by both the equal error rate and the minimum value of the NIST detection cost function, a 10–15% relative improvement is achieved over the baseline of a traditional i-vector-based system.

Keywords: Speaker verification; i-vector; Total factor space; Phonetic variability; Component reduction analysis (CRA)
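
The within-class and between-class covariance construction mentioned in the abstract can be illustrated with a minimal sketch. This is standard linear discriminant analysis applied to i-vectors, not the CRA algorithm proposed in the paper, whose details are not given here. The function names `scatter_matrices` and `lda_projection`, the `ridge` regularizer, and the array layout (one i-vector per row) are assumptions made for illustration only.

```python
import numpy as np


def scatter_matrices(ivectors, speaker_ids):
    """Return (S_w, S_b), both normalized by the number of utterances.

    ivectors: array of shape (n_utterances, dim), one i-vector per row.
    speaker_ids: array of shape (n_utterances,), one speaker label per row.
    """
    ivectors = np.asarray(ivectors, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    n, dim = ivectors.shape
    global_mean = ivectors.mean(axis=0)

    s_w = np.zeros((dim, dim))  # within-speaker (session/channel) scatter
    s_b = np.zeros((dim, dim))  # between-speaker scatter
    for spk in np.unique(speaker_ids):
        x = ivectors[speaker_ids == spk]
        mu = x.mean(axis=0)
        centered = x - mu
        s_w += centered.T @ centered
        diff = (mu - global_mean)[:, None]
        s_b += len(x) * (diff @ diff.T)
    return s_w / n, s_b / n


def lda_projection(s_w, s_b, n_components, ridge=1e-6):
    """Directions that maximize between-speaker relative to within-speaker variance.

    Solves S_b v = lambda * S_w v via inv(S_w) @ S_b; the small ridge term
    (an assumption of this sketch) keeps S_w invertible when data are scarce.
    """
    dim = s_w.shape[0]
    mat = np.linalg.solve(s_w + ridge * np.eye(dim), s_b)
    evals, evecs = np.linalg.eig(mat)
    order = np.argsort(evals.real)[::-1][:n_components]
    return evecs[:, order].real
```

Projecting i-vectors with `ivectors @ lda_projection(s_w, s_b, k)` keeps the `k` directions where speakers differ most relative to session variation, which is the sense in which such compensation suppresses within-class variance.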
