Cascaded Encoders for Unifying Streaming and Non-Streaming ASR


Abstract

End-to-end (E2E) automatic speech recognition (ASR) models have shown competitive performance on several benchmarks. These models are structured to operate in either streaming or non-streaming mode. This work presents cascaded encoders for building a single E2E ASR model that can operate in both of these modes simultaneously. The proposed model consists of a streaming encoder and a non-streaming encoder. Input features are first processed by the streaming encoder; the non-streaming encoder operates exclusively on the output of the streaming encoder. A single decoder then learns to decode using the output of either the streaming or the non-streaming encoder. Results show that this model achieves word error rates (WER) similar to those of a standalone streaming model when operating in streaming mode, and obtains a 10% to 27% relative improvement when operating in non-streaming mode. The results also show that the proposed approach outperforms existing E2E two-pass models, especially on long-form speech.
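
The data flow described in the abstract can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: the running-mean "streaming encoder", the global-mean "non-streaming encoder", the argmax "decoder", and all function names are hypothetical stand-ins chosen so the example runs with the standard library alone. Only the wiring follows the abstract: features go through the streaming encoder first, the non-streaming encoder sees nothing but the streaming encoder's output, and one decoder serves both modes.

```python
from typing import List, Sequence

Frame = List[float]


def streaming_encoder(features: Sequence[Frame]) -> List[Frame]:
    """Causal toy encoder: output at frame t depends only on frames 0..t."""
    out: List[Frame] = []
    running = [0.0] * len(features[0])
    for t, frame in enumerate(features, start=1):
        running = [r + x for r, x in zip(running, frame)]
        out.append([r / t for r in running])  # running mean of past frames
    return out


def non_streaming_encoder(streamed: Sequence[Frame]) -> List[Frame]:
    """Full-context toy encoder. Its only input is the streaming encoder's
    output; each frame is mixed with a summary of the whole utterance."""
    n, dim = len(streamed), len(streamed[0])
    global_mean = [sum(f[d] for f in streamed) / n for d in range(dim)]
    return [[0.5 * x + 0.5 * g for x, g in zip(f, global_mean)]
            for f in streamed]


def shared_decoder(encoded: Sequence[Frame]) -> List[int]:
    """Toy decoder shared by both modes: index of the largest value per frame."""
    return [max(range(len(f)), key=f.__getitem__) for f in encoded]


def recognize(features: Sequence[Frame], mode: str = "streaming") -> List[int]:
    """Run the cascade and decode from the encoder selected by `mode`."""
    streamed = streaming_encoder(features)  # always runs first
    if mode == "streaming":
        return shared_decoder(streamed)
    if mode == "non_streaming":
        return shared_decoder(non_streaming_encoder(streamed))
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # made-up 2-dim features
    print(recognize(feats, "streaming"))
    print(recognize(feats, "non_streaming"))
```

In this sketch the streaming path never looks ahead, so its output for a prefix of the input matches the corresponding prefix of its output on the full input. The non-streaming path reuses that same computation and adds full-utterance context on top of it, which is the property that lets one model serve both modes.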


Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2021 | pp. 5629-5633 | 5 pages
Authors
Arun Narayanan; Tara N. Sainath; Ruoming Pang; Jiahui Yu; Chung-Cheng Chiu; Rohit Prabhavalkar; Ehsan Variani; Trevor Strohman
Original format: PDF
Keywords
Training; Error analysis; Computational modeling; Conferences; Computer architecture; Signal processing; Acoustics
Indexed: 2022-08-26 13:57:29

Similar documents

1. A Unified Architecture Model of Web Applications [J]. Journal of Shanghai University (English Edition). 2002, No. 3
2. Cascade Macroinvertebrate Assemblages for In-Stream Flow Criteria and Biomonitoring of Tropical Mountain Streams [J]. M. E. Shoda, K. R. Gorbach, M. E. Benbow. River Research and Applications. 2012, No. 3
3. Patent Application Titled "Systems and Methods for Saving Encoded Media Streamed Using Adaptive Bitrate Streaming" Under Review [J]. Journal of Engineering. 2013, No. 13
4. Design of a low complexity video encoder for non-streaming video applications [C]. Krause, M., Muller, . 2002
5. A framework for quality-adaptive media streaming: Encode once - stream anywhere [D]. Krasic, Charles. 2004
6. Streaming MASSIF: Cascading Reasoning for Efficient Processing of IoT Data Streams [O]. Pieter Bonte, Riccardo Tommasini, Emanuele Della Valle. 2018
7. Stream temperature response to partial canopy removal in first and second order streams in the central Oregon Cascades [O]. 2002
8. Benefits of a Unified LaSRS++ Simulation for NAS-Wide and High-Fidelity Modeling [R]. Glaab, P., Madden, M. 2014
