IEEE International Conference on Acoustics, Speech and Signal Processing

Gaussian-constrained Training for Speaker Verification

Abstract

Neural models, in particular the d-vector and x-vector architectures, have produced state-of-the-art performance on many speaker verification tasks. However, two potential problems of these neural models deserve more investigation. First, both models suffer from 'information leak': some parameters that participate in model training are discarded during inference, i.e., the layers used as the classifier. Second, these models do not regulate the distribution of the derived speaker vectors, and this 'unconstrained distribution' may degrade the performance of the subsequent scoring component, e.g., PLDA. This paper proposes a Gaussian-constrained training approach that (1) discards the parametric classifier, and (2) enforces the distribution of the derived speaker vectors to be Gaussian. Our experiments on the VoxCeleb and SITW databases demonstrated that this new training approach produced more representative and regular speaker embeddings, leading to consistent performance improvement.
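
To make the idea concrete, below is a minimal PyTorch-style sketch of how such an objective might look: a non-parametric classification term that scores embeddings against per-speaker class means (so no discardable classifier layers are trained), plus a penalty acting as a standard-Gaussian log-prior on the embeddings. The function name, the class-mean scoring, and the gauss_weight trade-off are illustrative assumptions for this sketch, not the exact loss used in the paper.

import torch
import torch.nn.functional as F

def gaussian_constrained_loss(embeddings, labels, class_means, gauss_weight=0.1):
    """Illustrative loss. embeddings: (batch, dim) speaker vectors from the encoder;
    labels: (batch,) integer speaker ids; class_means: (num_speakers, dim) per-speaker
    means maintained outside the network (e.g. as running averages), so no parametric
    classifier is trained and later discarded."""
    # Non-parametric classification: score each embedding by its negative Euclidean
    # distance to every speaker mean; a closer mean yields a higher logit.
    logits = -torch.cdist(embeddings, class_means)
    cls_loss = F.cross_entropy(logits, labels)

    # Gaussian constraint: negative log-density of a standard normal prior (up to a
    # constant), pulling the embedding distribution toward N(0, I).
    gauss_loss = 0.5 * embeddings.pow(2).sum(dim=1).mean()

    return cls_loss + gauss_weight * gauss_loss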
