IEEE International Conference on Acoustics, Speech and Signal Processing

DYNAMIC FRAME SKIPPING FOR FAST SPEECH RECOGNITION IN RECURRENT NEURAL NETWORK BASED ACOUSTIC MODELS


Abstract

A recurrent neural network is a powerful tool for modeling sequential data such as text and speech. While recurrent neural networks have achieved record-breaking results in speech recognition, one remaining challenge is their slow processing speed. The main cause comes from the nature of recurrent neural networks that read only one frame at each time step. Therefore, reducing the number of reads is an effective approach to reducing processing time. In this paper, we propose a novel recurrent neural network architecture called Skip-RNN, which dynamically skips speech frames that are less important. The Skip-RNN consists of an acoustic model network and skip-policy network that are jointly trained to classify speech frames and determine how many frames to skip. We evaluate our proposed approach on the Wall Street Journal corpus and show that it can accelerate acoustic model computation by up to 2.4 times without any noticeable degradation in transcription accuracy.
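The abstract describes a skip-policy network that, after reading each frame, decides how many upcoming frames the acoustic model may skip. The control flow can be sketched as below. This is a minimal, hypothetical illustration, not the authors' implementation: the toy `rnn_step` cell, the greedy `skip_policy`, and the random weights stand in for the jointly trained networks described in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, FEAT, MAX_SKIP = 8, 4, 3

# Toy weights (randomly initialised; a real Skip-RNN would learn the
# acoustic model and skip-policy jointly, as the paper describes).
W_h = rng.normal(size=(HIDDEN, HIDDEN)) * 0.1
W_x = rng.normal(size=(HIDDEN, FEAT)) * 0.1
W_s = rng.normal(size=(MAX_SKIP + 1, HIDDEN)) * 0.1

def rnn_step(h, x):
    # One recurrent update: read a single speech frame.
    return np.tanh(W_h @ h + W_x @ x)

def skip_policy(h):
    # Greedy choice among {0, ..., MAX_SKIP} frames to skip next.
    return int(np.argmax(W_s @ h))

def skip_rnn_forward(frames):
    # Returns how many frames were actually read; skipped frames
    # are never fed through the recurrent cell, which is where the
    # speed-up comes from.
    h = np.zeros(HIDDEN)
    t, reads = 0, 0
    while t < len(frames):
        h = rnn_step(h, frames[t])   # read one frame
        reads += 1
        t += 1 + skip_policy(h)      # jump past the skipped frames
    return reads

frames = rng.normal(size=(100, FEAT))
reads = skip_rnn_forward(frames)
print(f"read {reads} of {len(frames)} frames")
```

With `MAX_SKIP = 3`, the loop reads between a quarter and all of the frames, so the per-utterance speed-up is bounded by `MAX_SKIP + 1`; the reported 2.4x acceleration falls inside that range.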
