A Real-Time End-to-End Multilingual Speech Recognition Architecture

Gonzalez-Dominguez Javier; Eustis David; Lopez-Moreno Ignacio; Senior Andrew; Beaufays Francoise; Moreno Pedro J.

Abstract

Automatic speech recognition (ASR) systems are used daily by millions of people worldwide to dictate messages, control devices, initiate searches, or facilitate data input on small devices. The user experience in these scenarios depends on the quality of the speech transcriptions and on the responsiveness of the system. For multilingual users, a further obstacle to natural interaction is the monolingual character of many ASR systems, in which users are constrained to a single preset language. In this work, we present an end-to-end multi-language ASR architecture, developed and deployed at Google, that allows users to select arbitrary combinations of spoken languages. We leverage recent advances in language identification and a novel method of real-time language selection to achieve recognition accuracy similar to, and latency characteristics nearly identical to, those of a monolingual system.

Bibliographic details

Source
Selected Topics in Signal Processing, IEEE Journal of | 2015, Issue 4 | pp. 749-759 | 11 pages
Authors
Gonzalez-Dominguez Javier; Eustis David; Lopez-Moreno Ignacio; Senior Andrew; Beaufays Francoise; Moreno Pedro J.
Affiliation
Google Inc. and Universidad Autonoma de Madrid, Madrid
Original format
PDF
Text language
English
Keywords
Computer architecture; Google; Pipelines; Real-time systems; Signal processing; Speech; Speech recognition; Automatic speech recognition (ASR); deep neural network (DNN); language identification (LID); multilingual

Similar literature

1. End-to-End Multilingual Speech Recognition System with Language Supervision Training [J]. Danyang LIU, Ji XU, Pengyuan ZHANG. IEICE Transactions on Information and Systems. 2020, Issue 6
2. Hybrid CTC/Attention Architecture for End-to-End Speech Recognition [J]. Shinji Watanabe, Takaaki Hori, Suyoun Kim, Selected Topics in Signal Processing, IEEE Journal of. 2017, Issue 8
3. Unified Architecture for Multichannel End-to-End Speech Recognition With Neural Beamforming [J]. Tsubasa Ochiai, Shinji Watanabe, Takaaki Hori, Selected Topics in Signal Processing, IEEE Journal of. 2017, Issue 8
4. End-to-End Multilingual Automatic Speech Recognition for Less-Resourced Languages: The Case of Four Ethiopian Languages [C]. Solomon Teferra Abate, Martha Yifiru Tachbelie, Tanja Schultz. IEEE International Conference on Acoustics, Speech and Signal Processing. 2021
5. End-to-End Speech Recognition on Conversations [D]. Kim, Suyoun. 2019
6. The accuracy of radiology speech recognition reports in a multilingual South African teaching hospital [O]. Jacqueline du Toit, Retha Hattingh, Richard Pitcher. 2015
7. End-to-End Multilingual Speech Recognition System with Language Supervision Training [O]. Danyang LIU, Ji XU, Pengyuan ZHANG. 2020
