IEEE International Conference on Acoustics, Speech and Signal Processing
Focus on the Present: A Regularization Method for the ASR Source-Target Attention Layer

Abstract

This paper introduces a novel method to diagnose the source-target attention in state-of-the-art end-to-end speech recognition models trained with joint connectionist temporal classification (CTC) and attention objectives. Our method is based on the fact that both CTC and the source-target attention act on the same encoder representations. To understand what the attention is doing, CTC is applied to compute token posteriors given the attention outputs. We find that the source-target attention heads are able to predict several tokens ahead of the current one. Inspired by this observation, we propose a new regularization method that leverages CTC to make the source-target attention focus on the frames corresponding to the output token currently being predicted by the decoder. Experiments show stable relative improvements of up to 7% and 13% with the proposed regularization on TED-LIUM 2 and LibriSpeech, respectively.
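A minimal NumPy sketch of the core idea described above: at each decoder step, the attention weights summarize the encoder frames, a shared output projection turns that summary into token posteriors, and a penalty encourages posterior mass on the token currently being emitted. All array names, dimensions, and the projection `W` are illustrative assumptions, not the paper's actual implementation (which uses the full CTC forward computation).

```python
import numpy as np

rng = np.random.default_rng(0)

T, U, d, V = 6, 3, 4, 5   # encoder frames, decoder steps, hidden dim, vocab size
h = rng.normal(size=(T, d))          # encoder output representations
a = rng.random(size=(U, T))          # source-target attention weights
a /= a.sum(axis=1, keepdims=True)    # normalize over frames per decoder step
W = rng.normal(size=(V, d))          # hypothetical shared token projection
y = np.array([1, 3, 2])              # target token ids, one per decoder step

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Token posteriors from the projection applied to the attention-weighted
# encoder summary at each decoder step; the regularizer rewards attention
# that concentrates on frames of the token being predicted right now.
reg_loss = 0.0
for u in range(U):
    context = a[u] @ h                    # attention context vector, shape (d,)
    p = softmax(W @ context)              # token posterior given attention output
    reg_loss += -np.log(p[y[u]] + 1e-12)  # penalize low mass on current token
reg_loss /= U
```

In training, a term like `reg_loss` would be added (with a tunable weight) to the joint CTC/attention objective, so gradients push the attention weights toward the frames that make the current token most probable.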
机译:本文介绍了一种新的方法,可以在最先进的端到端语音识别模型中诊断源极关注,具有关节连接主人时间分类(CTC)和注意力训练。 我们的方法基于以下事实,即CTC和源极关注,采用相同的编码器表示。 要了解注意力,CTC应用于指定注意输出来计算令牌后辅导。 我们发现,源极点关注头能够在当前的位置预测几个令牌。 通过观察的启发,提出了一种新的正则化方法,其利用CTC使源极关注更专注于对应于解码器预测的输出令牌的帧。 实验显示稳定的改善高达7%和13%,在TED-lium 2和Libriispeech上的拟议正则化。
