Conference: Workshop on Automatic Speech Recognition and Understanding

Speaker Adaptation of Neural Network Acoustic Models Using I-Vectors

Abstract

We propose to adapt deep neural network (DNN) acoustic models to a target speaker by supplying speaker identity vectors (i-vectors) as input features to the network in parallel with the regular acoustic features for ASR. In both training and testing, the i-vector for a given speaker is concatenated to every frame belonging to that speaker, so it stays fixed within a speaker and changes only across speakers. Experimental results on a 300-hour Switchboard corpus show that DNNs trained on speaker-independent features and i-vectors achieve a 10% relative improvement in word error rate (WER) over networks trained on speaker-independent features only. These networks are comparable in performance to DNNs trained on speaker-adapted features (with VTLN and FMLLR), with the advantage that only one decoding pass is needed. Furthermore, networks trained on speaker-adapted features and i-vectors achieve a 5-6% relative improvement in WER after Hessian-free sequence training over networks trained on speaker-adapted features only.
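The input construction described in the abstract, where one i-vector per speaker is appended to every acoustic frame of that speaker, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `append_ivector` and `build_dnn_inputs`, the speaker IDs, the toy dimensions, and all numeric values are assumptions made for illustration, and plain Python lists are used only to keep the sketch free of dependencies.

```python
from typing import Dict, List, Sequence, Tuple

Frame = List[float]


def append_ivector(frames: Sequence[Sequence[float]],
                   ivector: Sequence[float]) -> List[Frame]:
    """Concatenate the same speaker i-vector to every acoustic frame."""
    return [list(frame) + list(ivector) for frame in frames]


def build_dnn_inputs(
    utterances: Sequence[Tuple[str, Sequence[Sequence[float]]]],
    speaker_ivectors: Dict[str, Sequence[float]],
) -> List[Frame]:
    """Build network inputs for a list of (speaker_id, frames) utterances.

    The i-vector is constant within a speaker and differs across speakers.
    """
    inputs: List[Frame] = []
    for speaker_id, frames in utterances:
        ivector = speaker_ivectors[speaker_id]
        inputs.extend(append_ivector(frames, ivector))
    return inputs


# Hypothetical toy data: 3-dimensional acoustic frames, 2-dimensional i-vectors.
utterances = [
    ("spk_a", [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]),
    ("spk_b", [[0.7, 0.8, 0.9]]),
]
speaker_ivectors = {"spk_a": [1.0, -1.0], "spk_b": [0.5, 0.25]}

dnn_inputs = build_dnn_inputs(utterances, speaker_ivectors)
for row in dnn_inputs:
    print(row)
```

Each resulting row has the acoustic dimensions followed by the i-vector dimensions. Because the same construction applies at training and test time, the speaker information reaches the network without a separate adaptation pass.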