2017 IEEE Automatic Speech Recognition and Understanding Workshop

Cross-domain speech recognition using nonparallel corpora with cycle-consistent adversarial networks

Abstract

Automatic speech recognition (ASR) systems often do not perform well when used in an acoustic domain different from that of the training data, for example on utterances spoken in noisy environments or in different speaking styles. We propose a novel approach to cross-domain speech recognition based on acoustic feature mappings provided by a deep neural network, which is trained using nonparallel speech corpora from two different domains and no phone labels. To train a target-domain acoustic model, we generate "fake" target speech features from the labeled source-domain features using a mapping G_f. We can also generate "fake" source features for testing from the target features using the backward mapping G_b, which is learned simultaneously with G_f. The mappings G_f and G_b are trained as adversarial networks using a conventional adversarial loss and a cycle-consistency loss criterion that encourages the backward mapping to bring the translated feature back to the original as closely as possible, so that G_b(G_f(x)) ≈ x. In a highly challenging task of model adaptation using only domain speech features, our method achieved up to 16% relative improvement in WER in the evaluation using the CHiME3 real test data. The backward mapping was also confirmed to be effective in a speaking style adaptation task.
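
The cycle-consistency criterion described in the abstract can be illustrated with a minimal sketch. The abstract does not say which distance is used, so the mean absolute (L1) error below is an assumption, and the affine toy mappings `g_f`, `g_b_exact`, and `g_b_poor` are hypothetical stand-ins for the deep networks G_f and G_b, not the authors' models.

```python
import numpy as np


def cycle_consistency_loss(x, g_f, g_b):
    """Mean absolute error between x and G_b(G_f(x)).

    The L1 distance is an assumption; the abstract only states that
    G_b(G_f(x)) should be close to x.
    """
    return float(np.mean(np.abs(g_b(g_f(x)) - x)))


# Hypothetical toy data: 5 frames of 4-dimensional "source" features.
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 4))

# Hypothetical forward mapping: a fixed affine transform.
scale, shift = 2.0, 0.5
g_f = lambda feats: scale * feats + shift

# A backward mapping that inverts g_f exactly, and one that does not.
g_b_exact = lambda feats: (feats - shift) / scale
g_b_poor = lambda feats: feats

loss_exact = cycle_consistency_loss(x, g_f, g_b_exact)
loss_poor = cycle_consistency_loss(x, g_f, g_b_poor)
```

With an exact inverse the loss is zero up to floating-point rounding, while the non-inverting backward mapping yields a strictly larger loss. In adversarial training, a term like this would be added to the adversarial loss so that both mappings are penalized when a round trip fails to recover the input.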
