Conference: IEEE International Conference on Acoustics, Speech, and Signal Processing

DURATION MISMATCH COMPENSATION FOR I-VECTOR BASED SPEAKER RECOGNITION SYSTEMS

Abstract

Speaker recognition systems trained on long-duration utterances are known to perform significantly worse when they encounter short test segments. To address this mismatch, we analyze the effect of duration variability on the phoneme distributions of speech utterances and on i-vector length. We demonstrate that, as utterance duration decreases, the number of detected unique phonemes and the i-vector length approach zero in a logarithmic and a non-linear fashion, respectively. Treating duration variability as additive noise in the i-vector space, we propose three strategies for its compensation: i) multi-duration training of the Probabilistic Linear Discriminant Analysis (PLDA) model, ii) score calibration using log duration as a Quality Measure Function (QMF), and iii) multi-duration PLDA training with synthesized short-duration i-vectors. Experiments are designed based on the 2012 National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE) protocol with varying test utterance durations. Experimental results demonstrate the effectiveness of the proposed schemes on short-duration test conditions, especially with the QMF calibration approach.
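Strategy ii) can be illustrated with a minimal sketch. The abstract does not give the functional form of the calibration, so the affine mapping below (bias, score weight, and a weight on log duration) and the logistic-regression fit are assumptions, not the authors' implementation. The function names `qmf_calibrate` and `fit_qmf` and the toy trials are hypothetical.

```python
import math


def qmf_calibrate(raw_score, duration_s, w0, w1, w2):
    """Assumed affine calibration: w0 + w1 * score + w2 * log(duration)."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return w0 + w1 * raw_score + w2 * math.log(duration_s)


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_qmf(trials, lr=0.05, epochs=500):
    """Fit (w0, w1, w2) by logistic regression with batch gradient descent.

    trials: list of (raw_score, duration_s, is_target) tuples,
    where is_target is 1 for a target trial and 0 otherwise.
    """
    w = [0.0, 1.0, 0.0]  # start from the uncalibrated score
    n = len(trials)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for score, duration_s, is_target in trials:
            feats = (1.0, score, math.log(duration_s))
            p = _sigmoid(sum(wi * fi for wi, fi in zip(w, feats)))
            for k in range(3):
                grad[k] += (p - is_target) * feats[k] / n
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return tuple(w)


# Toy data (hypothetical): (raw PLDA score, test duration in seconds, label).
toy_trials = [
    (3.0, 30.0, 1), (2.0, 10.0, 1), (1.0, 3.0, 1), (2.5, 100.0, 1),
    (-2.0, 30.0, 0), (-1.0, 10.0, 0), (0.5, 3.0, 0), (-3.0, 100.0, 0),
]
w0, w1, w2 = fit_qmf(toy_trials)
calibrated = qmf_calibrate(1.0, 3.0, w0, w1, w2)
```

Using the logarithm of duration rather than duration itself matches the abstract's wording ("log duration as a Quality Measure Function"). In a real system the weights would be fitted on a held-out development set covering the test durations of interest.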
