IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model



Abstract

While the community keeps promoting end-to-end models over conventional hybrid models, which are usually long short-term memory (LSTM) models trained with a cross-entropy criterion followed by a sequence discriminative training criterion, we argue that such conventional hybrid models can still be significantly improved. In this paper, we detail our recent efforts to improve conventional hybrid LSTM acoustic models for high-accuracy and low-latency automatic speech recognition. To achieve high accuracy, we use a contextual layer trajectory LSTM (cltLSTM), which decouples the temporal modeling and target classification tasks and incorporates future context frames to gather more information for accurate acoustic modeling. We further improve the training strategy with sequence-level teacher-student learning. To obtain low latency, we design a two-head cltLSTM in which, relative to a standard LSTM, one head has zero latency and the other head has a small latency. When trained with Microsoft's 65 thousand hours of anonymized training data and evaluated on test sets totaling 1.8 million words, the proposed two-head cltLSTM model with the proposed training strategy yields a 28.2% relative WER reduction over the conventional LSTM acoustic model, with a similar perceived latency.
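The latency trade-off behind the two-head design can be illustrated with a toy NumPy sketch. This is not the paper's model: a per-frame linear projection (`W_enc`, hypothetical) stands in for the shared cltLSTM layers, and the lookahead head is approximated by a simple K-frame context window. It only demonstrates that the zero-lookahead head's output at frame t depends on no future frames, while the lookahead head's output for frame t is ready only K frames later.

```python
import numpy as np

def two_head_forward(x, W_enc, W_fast, W_slow, K):
    """Shared encoder followed by two output heads with different lookahead."""
    h = np.tanh(x @ W_enc)                     # shared representation, (T, H)
    fast = h @ W_fast                          # zero-lookahead head: uses frame t only
    T, H = h.shape
    pad = np.vstack([h, np.zeros((K, H))])     # pad so the last frames have context
    ctx = np.stack([pad[t:t + K + 1].reshape(-1) for t in range(T)])
    slow = ctx @ W_slow                        # lookahead head: uses frames t..t+K
    return fast, slow

rng = np.random.default_rng(0)
T, D, H, K, C = 20, 8, 16, 4, 5                # frames, feat dim, hidden, lookahead, classes
W_enc  = rng.standard_normal((D, H)) * 0.1
W_fast = rng.standard_normal((H, C)) * 0.1
W_slow = rng.standard_normal((H * (K + 1), C)) * 0.1

x = rng.standard_normal((T, D))
fast1, slow1 = two_head_forward(x, W_enc, W_fast, W_slow, K)

# Perturb a future frame: the zero-lookahead head's earlier outputs are
# unchanged, while the lookahead head's outputs within K frames of it change.
x2 = x.copy()
x2[10] += 1.0
fast2, slow2 = two_head_forward(x2, W_enc, W_fast, W_slow, K)
```

In the real system both heads share the cltLSTM stack and are trained jointly, so the zero-latency head can stream hypotheses immediately while the small-latency head refines them a few frames later.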
