
Modular neural networks exploit large acoustic context through broad-class posteriors for continuous speech recognition


Abstract

Traditionally, neural networks such as multi-layer perceptrons handle acoustic context by increasing the dimensionality of the observation vector so that it includes information from the neighbouring acoustic vectors on either side of the current frame. As a result, the monolithic network is trained on a high-dimensional space. The trend is to use the same fixed-size observation vector for the single network that estimates the posterior probabilities for all phones simultaneously. We propose a decomposition of the network into modular components, where each component estimates a phone posterior. The size of the observation vector we use is not fixed across the modularised networks, but rather accounts for the phone that each network is trained to classify. For each observation vector, we estimate very large acoustic context through broad-class posteriors. The use of the broad-class posteriors along with the phone posteriors greatly enhances acoustic modelling. We report significant improvements in phone classification and word recognition on the TIMIT corpus. Our results are also better than the best context-dependent system in the literature.
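
The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not say how the broad-class posteriors are computed or combined with the acoustic input, so everything concrete here is an assumption. The phone set, the broad classes, the context widths, and the helper names (`stack_context`, `broad_posteriors`, `phone_posteriors`) are hypothetical, untrained single-layer units with random weights stand in for the trained networks, and renormalising the module outputs is a choice made only for this sketch.

```python
import math
import random

random.seed(0)  # reproducible toy weights and features

FEAT_DIM = 3                                   # toy feature size (assumption)
BROAD_CLASSES = ["vowel", "stop", "silence"]   # hypothetical broad classes
BROAD_CONTEXT = 4                              # frames per side for the broad-class net (assumption)
PHONE_CONTEXT = {"aa": 2, "t": 1, "sil": 0}    # hypothetical per-phone frames per side


def stack_context(frames, t, width):
    """Concatenate frames t-width..t+width, repeating edge frames at utterance boundaries."""
    last = len(frames) - 1
    stacked = []
    for k in range(t - width, t + width + 1):
        stacked.extend(frames[min(max(k, 0), last)])
    return stacked


def dot(weights, x):
    return sum(w * v for w, v in zip(weights, x))


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_weights(n):
    return [random.uniform(-0.5, 0.5) for _ in range(n)]


# Untrained single-layer stand-ins for the trained networks (assumption).
broad_net = [random_weights(FEAT_DIM * (2 * BROAD_CONTEXT + 1)) for _ in BROAD_CLASSES]
phone_nets = {
    phone: random_weights(FEAT_DIM * (2 * width + 1) + len(BROAD_CLASSES))
    for phone, width in PHONE_CONTEXT.items()
}


def broad_posteriors(frames, t):
    """Broad-class posteriors computed from a wide window around frame t."""
    x = stack_context(frames, t, BROAD_CONTEXT)
    return softmax([dot(w, x) for w in broad_net])


def phone_posteriors(frames, t):
    """One module per phone: its own context width plus the broad-class posteriors."""
    broad = broad_posteriors(frames, t)
    scores = {}
    for phone, width in PHONE_CONTEXT.items():
        x = stack_context(frames, t, width) + broad
        scores[phone] = 1.0 / (1.0 + math.exp(-dot(phone_nets[phone], x)))
    total = sum(scores.values())  # renormalisation is an assumption of this sketch
    return {phone: s / total for phone, s in scores.items()}


frames = [[random.gauss(0.0, 1.0) for _ in range(FEAT_DIM)] for _ in range(10)]
posteriors = phone_posteriors(frames, 5)
print(posteriors)
```

In this sketch each phone module sees only a narrow window of raw frames, while the wide context reaches it in compressed form as a handful of broad-class posteriors, so the module's input stays small even when the effective context is large.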


Bibliographic record

Source: 2001, pp. 505-508 (4 pages)
Author: Antoniou, C.
Original format: PDF
Chinese Library Classification: Radio electronics, telecommunications technology

Similar documents

1. A Speaker-Dependent Approach to Single-Channel Joint Speech Separation and Acoustic Modeling Based on Deep Neural Networks for Robust Recognition of Multi-Talker Speech [J]. Yan-Hui Tu, Jun Du, Chin-Hui Lee. Journal of signal processing systems for signal, image, and video technology. 2018, No. 7.
2. Modular Construction of Time-Delay Neural Networks for Speech Recognition [J]. Waibel A. Neural computation. 1989, No. 1.
3. Acoustic landmarks contain more information about the phone string than other frames for automatic speech recognition with deep neural network acoustic model [J]. He Di, Lim Boon Pang, Yang Xuesong. The Journal of the Acoustical Society of America. 2018, No. 6aPta1.
4. Modular neural networks exploit large acoustic context through broad-class posteriors for continuous recognition [C]. Christos Antoniou. IEEE International Conference on Acoustics, Speech, and Signal Processing. 2001.
5. Dysarthric Speech Recognition and Offline Handwriting Recognition using Deep Neural Networks [D]. Pillai, Suhas Balkrishna. 2017.
6. Multi-resolution speech analysis for automatic speech recognition using deep neural networks: Experiments on TIMIT [O]. Doroteo T. Toledano, María Pilar Fernández-Gallego, Alicia Lozano-Diez. 2012.
7. Phonetically Motivated Acoustic Parameters For Continuous Speech Recognition Using Artificial Neural Networks [O]. Yoshua Bengio, Renato De Mori, Giovanni Flammia. 1992.
8. Modular Neural Networks for Speech Recognition [R]. Fritsch, J. 1996.
