
Simplified supervised i-vector modeling with application to robust and efficient language identification and speaker verification


Abstract

This paper presents a simplified and supervised i-vector modeling approach with applications to robust and efficient language identification and speaker verification. First, by concatenating the label vector and the linear regression matrix at the end of the mean supervector and the i-vector factor loading matrix, respectively, the traditional i-vectors are extended to label-regularized supervised i-vectors. These supervised i-vectors are optimized not only to reconstruct the mean supervectors well but also to minimize the mean square error between the original and the reconstructed label vectors, which makes the supervised i-vectors more discriminative with respect to the label information. Second, factor analysis (FA) is performed on the pre-normalized centered GMM first-order statistics supervector to ensure that each Gaussian component's statistics sub-vector is treated equally in the FA, which reduces the computational cost by a factor of 25 in the simplified i-vector framework. Third, since the entire matrix inversion term in the simplified i-vector extraction depends on a single variable (the total frame number), we build a global table of the resulting matrices indexed by the log values of the frame numbers. Using this lookup table, each utterance's simplified i-vector extraction is further sped up by a factor of 4 and suffers only a small quantization error. Finally, the simplified version of the supervised i-vector modeling is proposed to enhance both robustness and efficiency. The proposed methods are evaluated on the DARPA RATS dev2 task, the NIST LRE 2007 general task and the NIST SRE 2010 female condition 5 task for noisy channel language identification, clean channel language identification and clean channel speaker verification, respectively.
For language identification on the DARPA RATS task, the simplified supervised i-vector modeling achieved 2%, 16%, and 7% relative equal error rate (EER) reductions on three different feature sets and ran more than 100 times faster than the baseline i-vector method on the 120 s task. Similar results were observed on the NIST LRE 2007 30 s task, with a 7% relative average cost reduction. Results also show that using Gammatone frequency cepstral coefficients, Mel-frequency cepstral coefficients and spectro-temporal Gabor features in conjunction with shifted-delta-cepstral features significantly improves overall language identification performance. For speaker verification, the proposed supervised i-vector approach outperforms the i-vector baseline by 12% and 7% relative in terms of EER and norm old minDCF values, respectively.
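The lookup-table idea from the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the form of the inversion term, inv(I + n·A) with A = TᵀT, the toy matrix `T`, the grid range and the helper names `build_inverse_table` and `lookup_inverse` are all assumptions made for illustration. The sketch precomputes the inverse on a grid that is evenly spaced in the log of the frame count n, then replaces each per-utterance inversion with a nearest-grid-point lookup.

```python
import numpy as np

def build_inverse_table(A, n_min, n_max, num_bins):
    """Precompute inv(I + n*A) on frame counts evenly spaced in log(n).

    The form I + n*A is an assumed stand-in for the inversion term.
    """
    log_grid = np.linspace(np.log(n_min), np.log(n_max), num_bins)
    eye = np.eye(A.shape[0])
    table = np.stack([np.linalg.inv(eye + np.exp(g) * A) for g in log_grid])
    return log_grid, table

def lookup_inverse(log_grid, table, n_frames):
    """Return the precomputed inverse whose grid point is nearest to log(n_frames)."""
    step = log_grid[1] - log_grid[0]
    idx = int(np.rint((np.log(n_frames) - log_grid[0]) / step))
    idx = min(max(idx, 0), len(log_grid) - 1)  # clamp out-of-range frame counts
    return table[idx]

# Toy data: a small random matrix standing in for a factor loading matrix.
rng = np.random.default_rng(0)
T = rng.standard_normal((200, 10)) * 0.05
A = T.T @ T

log_grid, table = build_inverse_table(A, n_min=100, n_max=100_000, num_bins=400)

# Compare the table entry with the exact inverse for one utterance length.
n = 3210
exact = np.linalg.inv(np.eye(10) + n * A)
approx = lookup_inverse(log_grid, table, n)
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
print(f"relative quantization error: {rel_err:.2e}")
```

Spacing the grid in log(n) keeps the relative error in n bounded by the same amount for short and long utterances, which is one plausible reason for indexing the table by log frame counts; a finer grid trades memory for a smaller quantization error.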

Bibliographic record

  • Source
    Computer Speech and Language | 2014, Issue 4 | pp. 940-958 | 19 pages
  • Authors

    Ming Li; Shrikanth Narayanan;

  • Affiliations

    Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California, Los Angeles, CA 90089, USA, Sun Yat-Sen University Carnegie Mellon University Joint Institute of Engineering, Sun Yat-Sen University, Guangzhou, China, Sun Yat-Sen University Carnegie Mellon University Shunde International Joint Research Institute, Shunde, China;

    Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California, Los Angeles, CA 90089, USA;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    Language identification; Speaker verification; I-vector; Supervised i-vector; Simplified i-vector; Simplified supervised i-vector;

