IEEE International Conference on Acoustics, Speech and Signal Processing

Analyzing ASR Pretraining for Low-Resource Speech-to-Text Translation



Abstract

Previous work has shown that for low-resource source languages, automatic speech-to-text translation (AST) can be improved by pre-training an end-to-end model on automatic speech recognition (ASR) data from a high-resource language. However, it is not clear what factors - e.g., language relatedness or size of the pretraining data - yield the biggest improvements, or whether pretraining can be effectively combined with other methods such as data augmentation. Here, we experiment with pretraining on datasets of varying sizes, including languages related and unrelated to the AST source language. We find that the best predictor of final AST performance is the word error rate of the pretrained ASR model, and that differences in ASR/AST performance correlate with how phonetic information is encoded in the later RNN layers of our model. We also show that pretraining and data augmentation yield complementary benefits for AST.
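The abstract describes a transfer-learning recipe: pretrain an end-to-end encoder-decoder model on high-resource ASR data, then initialize the low-resource AST model from those weights before fine-tuning. The following is a minimal PyTorch sketch of that recipe, not the authors' implementation; the model sizes, layer counts, shared vocabulary, and dummy training data are hypothetical placeholders.

```python
import torch
import torch.nn as nn


class Seq2Seq(nn.Module):
    """Toy RNN encoder-decoder over speech features, shared by ASR and AST."""

    def __init__(self, feat_dim=80, hidden=256, vocab_size=5000):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, num_layers=3, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, num_layers=2, batch_first=True)
        self.embed = nn.Embedding(vocab_size, hidden)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, feats, targets):
        # Encode speech frames; pass the top encoder states to the decoder.
        _, (h, c) = self.encoder(feats)
        dec_in = self.embed(targets)  # teacher forcing
        dec_out, _ = self.decoder(dec_in, (h[-2:].contiguous(), c[-2:].contiguous()))
        return self.out(dec_out)


def train(model, batches, epochs=1):
    """Generic cross-entropy training loop used for both the ASR and AST steps."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for feats, targets in batches:
            logits = model(feats, targets[:, :-1])
            loss = loss_fn(logits.reshape(-1, logits.size(-1)),
                           targets[:, 1:].reshape(-1))
            opt.zero_grad()
            loss.backward()
            opt.step()


# Dummy data standing in for real corpora: 4 utterances of 50 frames with
# 80-dim features, and length-10 token targets.
torch.manual_seed(0)
feats = torch.randn(4, 50, 80)
transcripts = torch.randint(0, 5000, (4, 10))   # ASR targets (high-resource language)
translations = torch.randint(0, 5000, (4, 10))  # AST targets (target language)

# 1) Pretrain on high-resource ASR data.
asr_model = Seq2Seq()
train(asr_model, [(feats, transcripts)])

# 2) Initialize the AST model from the pretrained weights, then fine-tune on the
#    low-resource speech-translation data. For simplicity the whole model is
#    copied here; the paper's analysis concerns what the pretrained layers
#    (especially the later encoder RNN layers) contribute to this step.
ast_model = Seq2Seq()
ast_model.load_state_dict(asr_model.state_dict())
train(ast_model, [(feats, translations)])
```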
