Systems and methods for robust speech recognition using generative adversarial networks

Abstract

Described herein are systems and methods for a general, scalable, end-to-end framework that uses a generative adversarial network (GAN) objective to enable robust speech recognition. Encoders trained with the proposed approach enjoy improved invariance by learning to map noisy audio to the same embedding space as that of clean audio. Embodiments of a Wasserstein GAN framework increase the robustness of seq-to-seq models in a scalable, end-to-end fashion. In one or more embodiments, an encoder component is treated as the generator of the GAN and is trained to produce indistinguishable embeddings between labeled and unlabeled audio samples. This new robust training approach can learn to induce robustness without alignment or a complicated inference pipeline, even where augmentation of audio data is not possible.
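
The adversarial setup described in the abstract can be illustrated with a toy sketch. This is not the patented implementation: the linear critic, the clipping bound, the learning rate, the weighting factor `lam`, the toy embeddings, and every function name below are assumptions chosen for illustration. The sketch shows only the general Wasserstein GAN pattern, in which a critic is trained to separate two groups of encoder embeddings and the encoder is penalized by the gap the critic finds.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def mean_vector(batch):
    """Element-wise mean of a batch of equal-length embeddings."""
    n = len(batch)
    return [sum(col) / n for col in zip(*batch)]

def wasserstein_gap(critic_w, emb_a, emb_b):
    """Critic estimate E[f(z_a)] - E[f(z_b)] for a linear critic f(z) = w . z."""
    return dot(critic_w, mean_vector(emb_a)) - dot(critic_w, mean_vector(emb_b))

def critic_step(critic_w, emb_a, emb_b, lr=0.1, clip=0.05):
    """One gradient-ascent step on the gap, followed by weight clipping.

    For a linear critic the gradient of the gap with respect to w is
    mean(emb_a) - mean(emb_b). Clipping keeps the critic's weights bounded,
    as in the original WGAN formulation.
    """
    grad = [a - b for a, b in zip(mean_vector(emb_a), mean_vector(emb_b))]
    stepped = [w + lr * g for w, g in zip(critic_w, grad)]
    return [max(-clip, min(clip, w)) for w in stepped]

def encoder_loss(task_loss, critic_w, emb_a, emb_b, lam=1.0):
    """Encoder (generator) objective: recognition loss plus weighted critic gap."""
    return task_loss + lam * wasserstein_gap(critic_w, emb_a, emb_b)

# Hypothetical embeddings an encoder might emit for two groups of audio.
clean_emb = [[1.0, 0.0], [0.8, 0.2]]
noisy_emb = [[0.2, 0.9], [0.0, 1.1]]

critic_w = critic_step([0.0, 0.0], clean_emb, noisy_emb)
gap = wasserstein_gap(critic_w, clean_emb, noisy_emb)
total = encoder_loss(2.0, critic_w, clean_emb, noisy_emb, lam=0.5)
```

In a real system the encoder and critic would be neural networks updated in alternation, and the encoder would be trained to shrink `gap` so that the two groups of embeddings become hard to tell apart. The value `2.0` passed as `task_loss` is a placeholder for a seq-to-seq recognition loss.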

Bibliographic data

Publication number: US10971142B2
Publication date: 2021-04-06
Original format: PDF
Applicant/Assignee: BAIDU USA LLC
Application number: US201816154648
Inventors: ANUROOP SRIRAM; HEE WOO JUN; YASHESH GAUR; SANJEEV SATHEESH
Filing date: 2018-10-08
Classification: G10L15/20; G10L15/06; G10L15/16
Country: US
Date added to database: 2022-08-24 18:04:40
