Published in: IEEE Journal of Selected Topics in Signal Processing

A Real-Time End-to-End Multilingual Speech Recognition Architecture


Abstract

Automatic speech recognition (ASR) systems are used daily by millions of people worldwide to dictate messages, control devices, initiate searches, or facilitate data input on small devices. The user experience in these scenarios depends on the quality of the speech transcriptions and on the responsiveness of the system. For multilingual users, a further obstacle to natural interaction is the monolingual character of many ASR systems, in which users are constrained to a single preset language. In this work, we present an end-to-end multi-language ASR architecture, developed and deployed at Google, that allows users to select arbitrary combinations of spoken languages. We leverage recent advances in language identification and a novel method of real-time language selection to achieve recognition accuracy similar to, and latency characteristics nearly identical to, those of a monolingual system.
