SYSTEMS AND METHODS FOR A MULTILINGUAL SPEECH RECOGNITION FRAMEWORK

Abstract

Embodiments described herein provide an Adapt-and-Adjust (A2) mechanism for a multilingual speech recognition model. The mechanism combines adaptation and adjustment methods in an integrated end-to-end training process to improve the model's generalization and mitigate the long-tail problem. Specifically, the multilingual language model mBERT is converted into an autoregressive transformer decoder. In addition, a cross-attention module that attends to the encoder output is added on top of mBERT's self-attention layer, so that the decoder explores the acoustic space in addition to the text space. Joint training of the encoder and the mBERT decoder can bridge the semantic gap between speech and text.
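The decoder structure described in the abstract can be illustrated with a minimal sketch: causal (autoregressive) self-attention over the text states, followed by cross-attention from the text states to the acoustic encoder outputs. This is not the patented implementation. The function names `attention` and `decoder_layer`, the single attention head, the identity projections in place of learned weights, and the omission of layer normalization, feed-forward blocks and mBERT's pretrained parameters are all simplifying assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention over lists of vectors.

    With causal=True, query i may only attend to keys 0..i, which is
    what makes a BERT-style layer usable as an autoregressive decoder.
    Assumption: identity projections instead of learned W_q, W_k, W_v.
    """
    scale = math.sqrt(len(keys[0]))
    outputs, weights = [], []
    for i, q in enumerate(queries):
        limit = i + 1 if causal else len(keys)
        scores = [sum(a * b for a, b in zip(q, k)) / scale
                  for k in keys[:limit]]
        # Masked (future) positions receive exactly zero weight.
        w = softmax(scores) + [0.0] * (len(keys) - limit)
        out = [sum(w[j] * values[j][d] for j in range(len(keys)))
               for d in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


def decoder_layer(text_states, acoustic_states):
    """One hypothetical decoder layer: self-attention, then cross-attention."""
    # 1) Causal self-attention over the text space, with a residual sum.
    self_out, self_w = attention(text_states, text_states, text_states,
                                 causal=True)
    hidden = [[a + b for a, b in zip(x, y)]
              for x, y in zip(text_states, self_out)]
    # 2) Cross-attention: text queries attend to the encoder (acoustic) outputs.
    cross_out, cross_w = attention(hidden, acoustic_states, acoustic_states)
    output = [[a + b for a, b in zip(x, y)]
              for x, y in zip(hidden, cross_out)]
    return output, self_w, cross_w


# Toy inputs: 3 text positions and 4 acoustic frames, both of dimension 2.
text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
acoustic = [[0.5, 0.1], [0.2, 0.9], [0.7, 0.7], [0.0, 0.3]]
output, self_w, cross_w = decoder_layer(text, acoustic)
```

In this sketch the first text position places all of its self-attention weight on itself, while every text position spreads its cross-attention weight over all four acoustic frames. In a trained system the encoder producing `acoustic` and the decoder layers would be optimized jointly, as the abstract describes.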

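Bibliographic data

  • Publication number: US2022108688A1
  • Publication date: 2022-04-07
  • Original format: PDF
  • Applicant/assignee: SALESFORCE.COM INC.
  • Application number: US202117162624
  • Filing date: 2021-01-29
  • Classification: G10L15/16; G10L15/065; G10L15/06; G06N3/04; G06N3/08
  • Country: US
  • Database entry time: 2022-08-25 00:22:59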
