Long short-term memory recurrent neural network-based acoustic model using connectionist temporal classification on a large-scale training corpus

Donghyun Lee; Minkyu Lim; Hosung Park; Yoseb Kang; Jeong-Sik Park; Gil-Jin Jang; Ji-Hwan Kim


Abstract

Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) have driven large improvements over acoustic models based on the Gaussian Mixture Model (GMM). However, LSTM RNN acoustic models built with the hybrid method require a forced-aligned Hidden Markov Model (HMM) state sequence obtained from a GMM-based acoustic model. Training therefore takes a long time, because both the GMM-based acoustic model and the deep learning-based acoustic model must be trained. To solve this problem, an acoustic model using the Connectionist Temporal Classification (CTC) algorithm is proposed. The CTC algorithm does not require the GMM-based acoustic model because it does not use the forced-aligned HMM state sequence. However, previous work on LSTM RNN-based acoustic models using CTC used small-scale training corpora. In this paper, an LSTM RNN-based acoustic model using CTC is trained on a large-scale training corpus and its performance is evaluated. The implemented acoustic model achieves a Word Error Rate (WER) of 6.18% for clean speech and 15.01% for noisy speech, which is similar to the performance of an acoustic model based on the hybrid method.
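
The claim that CTC needs no forced alignment can be illustrated with a minimal sketch of the standard CTC forward recursion. It is not the authors' implementation. It only shows how the probability of a label sequence is obtained by summing over every frame-level path that collapses to it, so no single alignment has to be supplied. The blank index, the function names, and the hand-made posterior table are assumptions made for illustration; in the paper the per-frame posteriors would come from the LSTM RNN.

```python
from itertools import product

BLANK = 0  # assumed index of the CTC blank symbol (illustrative choice)


def collapse(path):
    """Map a frame-level path to labels: merge repeated symbols, then drop blanks."""
    out = []
    prev = None
    for s in path:
        if s != prev and s != BLANK:
            out.append(s)
        prev = s
    return tuple(out)


def ctc_probability(probs, labels):
    """Total probability of all paths that collapse to `labels`.

    probs[t][k] is the posterior of symbol k at frame t (hypothetical values here).
    Implements the forward recursion over the blank-extended label sequence.
    """
    ext = [BLANK]
    for label in labels:
        ext += [label, BLANK]
    size = len(ext)

    # Frame 0: a path may start with a blank or with the first label.
    alpha = [0.0] * size
    alpha[0] = probs[0][BLANK]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]                  # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]         # advance by one position
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # A valid path ends on the last label or on the trailing blank.
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)


def brute_force_probability(probs, labels):
    """Reference check: enumerate every path explicitly (tiny inputs only)."""
    num_symbols = len(probs[0])
    total = 0.0
    for path in product(range(num_symbols), repeat=len(probs)):
        if collapse(path) == tuple(labels):
            p = 1.0
            for t, s in enumerate(path):
                p *= probs[t][s]
            total += p
    return total
```

For a small table such as four frames over three symbols, `ctc_probability(probs, (1, 2))` and `brute_force_probability(probs, (1, 2))` should agree up to floating-point error. Training with CTC minimizes the negative logarithm of this quantity, which is why no GMM-derived state alignment is needed.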


Bibliographic details

Source
Communications, China | 2017, Issue 9 | pp. 23-31 | 9 pages
Authors
Donghyun Lee; Minkyu Lim; Hosung Park; Yoseb Kang; Jeong-Sik Park; Gil-Jin Jang; Ji-Hwan Kim
Author affiliations

Department of Computer Science and Engineering, Sogang University 35 Baekbeom-ro, Mapo-gu, Seoul, 04107, Republic of Korea;

Department of Computer Science and Engineering, Sogang University 35 Baekbeom-ro, Mapo-gu, Seoul, 04107, Republic of Korea;

Department of Computer Science and Engineering, Sogang University 35 Baekbeom-ro, Mapo-gu, Seoul, 04107, Republic of Korea;

Department of Computer Science and Engineering, Sogang University 35 Baekbeom-ro, Mapo-gu, Seoul, 04107, Republic of Korea;

Department of Information and Communication Engineering, Yeungnam University 280 Daehak-Ro, Gyeongsan, Gyeongbuk, 38541, Republic of Korea;

School of Electronics Engineering, Kyungpook National University 80 Daehakro, Bukgu, Daegu, 41566, Republic of Korea;

Department of Computer Science and Engineering, Sogang University 35 Baekbeom-ro, Mapo-gu, Seoul, 04107, Republic of Korea;

Original format: PDF
Language: English
Keywords
Hidden Markov models; Logic gates; Acoustics; Training; Mathematical model; Computer architecture; Speech;


