ORTHROS: non-autoregressive end-to-end speech translation with dual-decoder

Abstract

Fast inference speed is an important goal for real-world deployment of speech translation (ST) systems. End-to-end (E2E) models based on the encoder-decoder architecture are more suitable for this goal than traditional cascaded systems, but their effectiveness regarding decoding speed has not been explored so far. Inspired by recent progress in non-autoregressive (NAR) methods in text-based translation, which generate target tokens in parallel by eliminating conditional dependencies, we study the problem of NAR decoding for E2E-ST. We propose a novel NAR E2E-ST framework, Orthros, in which both NAR and autoregressive (AR) decoders are jointly trained on the shared speech encoder. The latter is used for selecting a better translation among candidates of various lengths generated by the former, which dramatically improves the effectiveness of a large length beam with negligible overhead. We further investigate effective length prediction methods from speech inputs and the impact of vocabulary sizes. Experiments on four benchmarks show the effectiveness of the proposed method in improving inference speed while maintaining competitive translation quality compared to state-of-the-art AR E2E-ST systems.
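
The candidate selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `select_by_rescoring`, `toy_nar_generate`, and `toy_ar_score` are hypothetical names, and the two toy functions merely stand in for a trained NAR decoder (one candidate per target length) and a trained AR decoder (one score per candidate). Only the selection logic is the point here.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Tokens = List[str]


def select_by_rescoring(
    lengths: Sequence[int],
    nar_generate: Callable[[int], Tokens],
    ar_score: Callable[[Tokens], float],
) -> Tuple[Tokens, float]:
    """Generate one candidate per target length and keep the best-scoring one."""
    if not lengths:
        raise ValueError("need at least one candidate length")
    best: Optional[Tokens] = None
    best_score = float("-inf")
    for n in lengths:
        candidate = nar_generate(n)   # stand-in for parallel NAR decoding
        score = ar_score(candidate)   # stand-in for AR decoder rescoring
        if score > best_score:        # ties keep the earlier length
            best, best_score = candidate, score
    assert best is not None
    return best, best_score


# Toy stand-ins (assumptions for illustration only, not real models).
REFERENCE = ["this", "is", "a", "test"]


def toy_nar_generate(n: int) -> Tokens:
    # Too-short lengths truncate the output; too-long lengths repeat the
    # last token, mimicking a typical NAR failure mode.
    padded = REFERENCE + [REFERENCE[-1]] * max(0, n - len(REFERENCE))
    return padded[:n]


def toy_ar_score(tokens: Tokens) -> float:
    # Penalise adjacent repeated tokens and tokens missing from the end.
    repeats = sum(a == b for a, b in zip(tokens, tokens[1:]))
    missing = max(0, len(REFERENCE) - len(tokens))
    return 0.0 - (repeats + missing)


if __name__ == "__main__":
    best, score = select_by_rescoring(range(2, 7), toy_nar_generate, toy_ar_score)
    print(best, score)
```

In this sketch the length beam is the range of candidate lengths (2 to 6). A real system would score all candidates in one batched, teacher-forced pass through the AR decoder rather than in a Python loop, which is why the abstract can describe the overhead as negligible.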


Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing, 2021, pp. 7503-7507 (5 pages)
Authors
Hirofumi Inaguma; Yosuke Higuchi; Kevin Duh; Tatsuya Kawahara; Shinji Watanabe
Original format: PDF
Keywords
Vocabulary; Conferences; Prediction methods; Signal processing; Benchmark testing; Acoustics; Decoding


