Coupled Training of Sequence-to-Sequence Models for Accented Speech Recognition

Abstract

Accented speech poses significant challenges for state-of-the-art automatic speech recognition (ASR) systems. Accent is a property of speech that lasts throughout an utterance in varying degrees of strength. This makes it hard to isolate the influence of accent on individual speech sounds. We propose coupled training for encoder-decoder ASR models that acts on pairs of utterances corresponding to the same text spoken by speakers with different accents. This training regime introduces an L2 loss between the attention-weighted representations corresponding to pairs of utterances with the same text, thus acting as a regularizer and encouraging representations from the encoder to be more accent-invariant. We focus on recognizing accented English samples from the Mozilla Common Voice corpus. We obtain significant error rate reductions on accented samples from a large set of diverse accents using coupled training. We also show consistent improvements in performance on heavily accented samples (as determined by a standalone accent classifier).
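
The coupling term described in the abstract can be illustrated with a minimal pure-Python sketch. It assumes that both utterances in a pair are decoded against the same reference text, so they yield the same number of decoder steps, and that the "attention-weighted representation" at each step is the attention-weighted sum of encoder states. The function names, the averaging over decoder steps, and the `weight` hyperparameter are illustrative assumptions, not details taken from the paper.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def context_vectors(attention: Matrix, encoder_states: Matrix) -> Matrix:
    """Return one attention-weighted sum of encoder states per decoder step.

    attention[t][i] is the weight decoder step t places on encoder frame i.
    """
    dim = len(encoder_states[0])
    contexts = []
    for weights in attention:
        ctx = [0.0] * dim
        for w, state in zip(weights, encoder_states):
            for d in range(dim):
                ctx[d] += w * state[d]
        contexts.append(ctx)
    return contexts


def coupling_loss(contexts_a: Matrix, contexts_b: Matrix) -> float:
    """Squared L2 distance between paired context vectors, averaged over steps."""
    if len(contexts_a) != len(contexts_b):
        raise ValueError("paired utterances must share the same decoder length")
    total = 0.0
    for ca, cb in zip(contexts_a, contexts_b):
        total += sum((x - y) ** 2 for x, y in zip(ca, cb))
    return total / len(contexts_a)


def coupled_objective(ce_a: float, ce_b: float,
                      contexts_a: Matrix, contexts_b: Matrix,
                      weight: float = 0.1) -> float:
    """Recognition losses for both utterances plus the weighted coupling term.

    `weight` is a hypothetical hyperparameter chosen for illustration.
    """
    return ce_a + ce_b + weight * coupling_loss(contexts_a, contexts_b)


# Toy pair: same two-token text, different numbers of encoder frames.
enc_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # accent A, 3 frames
att_a = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]     # 2 decoder steps
enc_b = [[1.0, 0.0], [0.0, 1.0]]               # accent B, 2 frames
att_b = [[1.0, 0.0], [0.0, 1.0]]               # 2 decoder steps

ctx_a = context_vectors(att_a, enc_a)
ctx_b = context_vectors(att_b, enc_b)
loss = coupled_objective(2.0, 3.0, ctx_a, ctx_b)
```

Because the term compares context vectors rather than raw encoder outputs, the two utterances may differ in length in frames. Only the number of decoder steps has to match, which holds when both are aligned to the same transcript.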

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2020 | pp. 8254-8258 | 5 pages
Authors
Vinit Unni; Nitish Joshi; Preethi Jyothi
Original format
PDF
Keywords
Accented speech recognition; sequence-to-sequence models with attention; coupled training

