Conference: Chinese Control and Decision Conference

End-to-End Feature Learning for Text-Independent Speaker Verification

Abstract

Deep neural networks (DNNs) are widely used in text-independent speaker verification, especially convolutional models trained with triplet loss. However, both the training efficiency and the quality of the learned features remain unsatisfactory. In this paper, we present an end-to-end framework for training speaker verification models efficiently. Specifically, we introduce redesigned residual blocks into the network architecture and propose a hard-triplet selection method that improves on the original triplet loss function. Furthermore, we investigate the effects of hyperparameters and of the framing strategy in the input pipeline for fine-tuning. Experimental results on the LibriSpeech and AISHELL-2 datasets show that the proposed method reduces the verification equal error rate by more than 20% relative, confirming its advantage over methods in previous work.
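
The abstract does not spell out how hard triplets are selected, so the following is only a minimal sketch of one common scheme (batch-hard mining), not the paper's actual rule. The function names, the use of cosine similarity, and the `margin` value are all assumptions made for illustration. For each anchor utterance, the sketch takes the least similar utterance from the same speaker and the most similar utterance from a different speaker, then applies a standard margin-based triplet loss.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def batch_hard_triplet_loss(embeddings, speaker_ids, margin=0.1):
    """Mean triplet loss over the hardest triplet of each anchor.

    Assumed selection rule (batch-hard mining), not necessarily the
    paper's: the hardest positive is the least similar same-speaker
    embedding, the hardest negative is the most similar embedding
    from any other speaker.
    """
    losses = []
    for i, (anchor, spk) in enumerate(zip(embeddings, speaker_ids)):
        # Similarities to other utterances of the same speaker.
        pos = [cosine_similarity(anchor, e)
               for j, (e, s) in enumerate(zip(embeddings, speaker_ids))
               if s == spk and j != i]
        # Similarities to utterances of all other speakers.
        neg = [cosine_similarity(anchor, e)
               for e, s in zip(embeddings, speaker_ids) if s != spk]
        if not pos or not neg:
            continue  # this anchor has no valid triplet in the batch
        hardest_pos = min(pos)
        hardest_neg = max(neg)
        # Penalize when the negative is not at least `margin` less similar.
        losses.append(max(0.0, hardest_neg - hardest_pos + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: speakers "a" and "b" have overlapping embeddings.
confused = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
# Toy batch: each speaker occupies its own direction.
separated = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
ids = ["a", "a", "b", "b"]

loss_confused = batch_hard_triplet_loss(confused, ids)
loss_separated = batch_hard_triplet_loss(separated, ids)
```

On the toy batches, the overlapping speakers give a positive loss while the well-separated speakers give zero loss, which is the behavior a triplet objective is meant to reward. In a real system the embeddings would come from the network and the loss would be computed with a differentiable tensor library rather than plain Python lists.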