IEEE International Conference on Acoustics, Speech and Signal Processing

Analyzing ASR Pretraining for Low-Resource Speech-to-Text Translation



Abstract

Previous work has shown that for low-resource source languages, automatic speech-to-text translation (AST) can be improved by pre-training an end-to-end model on automatic speech recognition (ASR) data from a high-resource language. However, it is not clear what factors - e.g., language relatedness or size of the pretraining data - yield the biggest improvements, or whether pretraining can be effectively combined with other methods such as data augmentation. Here, we experiment with pretraining on datasets of varying sizes, including languages related and unrelated to the AST source language. We find that the best predictor of final AST performance is the word error rate of the pretrained ASR model, and that differences in ASR/AST performance correlate with how phonetic information is encoded in the later RNN layers of our model. We also show that pretraining and data augmentation yield complementary benefits for AST.
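The abstract describes a transfer-learning recipe: pretrain an end-to-end encoder-decoder model on high-resource ASR data, then initialize the low-resource AST model from those weights before fine-tuning. The following is a minimal PyTorch sketch of that recipe, not the authors' implementation; the model sizes, layer counts, shared vocabulary, and dummy training data are hypothetical placeholders.

```python
import torch
import torch.nn as nn


class Seq2Seq(nn.Module):
    """Toy RNN encoder-decoder over speech features, shared by ASR and AST."""

    def __init__(self, feat_dim=80, hidden=256, vocab_size=5000):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, num_layers=3, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, num_layers=2, batch_first=True)
        self.embed = nn.Embedding(vocab_size, hidden)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, feats, targets):
        # Encode speech frames; pass the top encoder states to the decoder.
        _, (h, c) = self.encoder(feats)
        dec_in = self.embed(targets)  # teacher forcing
        dec_out, _ = self.decoder(dec_in, (h[-2:].contiguous(), c[-2:].contiguous()))
        return self.out(dec_out)


def train(model, batches, epochs=1):
    """Generic cross-entropy training loop used for both the ASR and AST steps."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for feats, targets in batches:
            logits = model(feats, targets[:, :-1])
            loss = loss_fn(logits.reshape(-1, logits.size(-1)),
                           targets[:, 1:].reshape(-1))
            opt.zero_grad()
            loss.backward()
            opt.step()


# Dummy data standing in for real corpora: 4 utterances of 50 frames with
# 80-dim features, and length-10 token targets.
torch.manual_seed(0)
feats = torch.randn(4, 50, 80)
transcripts = torch.randint(0, 5000, (4, 10))   # ASR targets (high-resource language)
translations = torch.randint(0, 5000, (4, 10))  # AST targets (target language)

# 1) Pretrain on high-resource ASR data.
asr_model = Seq2Seq()
train(asr_model, [(feats, transcripts)])

# 2) Initialize the AST model from the pretrained weights, then fine-tune on the
#    low-resource speech-translation data. For simplicity the whole model is
#    copied here; the paper's analysis concerns what the pretrained layers
#    (especially the later encoder RNN layers) contribute to this step.
ast_model = Seq2Seq()
ast_model.load_state_dict(asr_model.state_dict())
train(ast_model, [(feats, translations)])
```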
