IEEE International Conference on Acoustics, Speech and Signal Processing

Exploring a Zero-Order Direct HMM Based on Latent Attention for Automatic Speech Recognition



Abstract

In this paper, we study a simple yet elegant latent-variable attention model for automatic speech recognition (ASR) that integrates attention-based sequence modeling into the direct hidden Markov model (HMM) framework. We use a sequence of hidden variables that establishes a mapping from output labels to input frames. Inspired by the direct HMM, we decompose the label sequence posterior into emission and transition probabilities under a zero-order assumption, and we incorporate both Transformer and LSTM attention models into this decomposition. The method keeps the explicit alignment as part of the stochastic model and combines the ease of end-to-end training of attention models with an efficient and simple beam search. To study the effect of the latent model, we qualitatively analyze the alignment behavior of the different approaches. Our experiments on three ASR tasks show promising word error rates (WER) along with more focused alignments in comparison to standard attention models.
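To make the zero-order decomposition concrete, the following sketch spells out one plausible form of it. The notation (labels a_1^N, inputs x_1^T, and a latent alignment position t_n per output step) is illustrative and not taken verbatim from the paper:

% Illustrative sketch of a zero-order latent-alignment decomposition;
% the symbols a_n, t_n, x_1^T are assumptions, not copied from the paper.
p(a_1^N \mid x_1^T)
    = \sum_{t_1^N} \prod_{n=1}^{N}
      p(a_n, t_n \mid a_1^{n-1}, t_1^{n-1}, x_1^T)
% Zero-order assumption: t_n does not depend on the earlier alignment
% positions t_1^{n-1}, so the sum over whole alignment sequences
% factorizes into an independent sum per output step:
    \approx \prod_{n=1}^{N} \sum_{t_n=1}^{T}
      \underbrace{p(t_n \mid a_1^{n-1}, x_1^T)}_{\text{transition (attention weights)}}
      \cdot
      \underbrace{p(a_n \mid t_n, a_1^{n-1}, x_1^T)}_{\text{emission (label model)}}

Under this reading, the normalized attention weights play the role of the zero-order transition (alignment) distribution and the label distribution at the attended frame plays the role of the emission, which is how the explicit alignment stays part of the stochastic model while label-synchronous beam search remains as simple as in a standard attention decoder.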
