
Modular neural networks exploit large acoustic context through broad-class posteriors for continuous speech recognition



Abstract

Traditionally, neural networks such as multi-layer perceptrons handle acoustic context by increasing the dimensionality of the observation vector so that it includes information from the neighbouring acoustic vectors on either side of the current frame. As a result, the monolithic network is trained on a high-dimensional space. The usual practice is to feed the same fixed-size observation vector to a single network that estimates the posterior probabilities for all phones simultaneously. We propose a decomposition of the network into modular components, where each component estimates a phone posterior. The size of the observation vector we use is not fixed across the modularised networks, but rather depends on the phone that each network is trained to classify. For each observation vector, we estimate very large acoustic context through broad-class posteriors. The use of the broad-class posteriors along with the phone posteriors greatly enhances acoustic modelling. We report significant improvements in phone classification and word recognition on the TIMIT corpus. Our results are also better than those of the best context-dependent system in the literature.
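
The input construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the per-phone context widths, and the choice to append the current frame's broad-class posteriors to the stacked frames are all assumptions made for illustration.

```python
import numpy as np

def stack_context(frames, t, width):
    """Concatenate frame t with `width` neighbours on each side.

    Indices that fall outside the utterance are clamped to the first
    or last frame (an assumed edge-handling choice).
    """
    idx = np.clip(np.arange(t - width, t + width + 1), 0, len(frames) - 1)
    return frames[idx].reshape(-1)

# Hypothetical per-phone context widths in frames; the values are illustrative only.
CONTEXT_WIDTH = {"aa": 4, "t": 2, "s": 3}

def module_input(frames, t, phone, broad_class_posteriors):
    """Build the input vector for one phone module.

    Assumption: the module sees its phone-specific window of acoustic
    frames plus the broad-class posteriors for the current frame.
    """
    local = stack_context(frames, t, CONTEXT_WIDTH[phone])
    return np.concatenate([local, broad_class_posteriors[t]])
```

With 2-dimensional frames and 3 broad classes, the module for "t" (width 2) would receive a vector of length (2*2 + 1)*2 + 3 = 13, while the module for "aa" (width 4) would receive one of length 21, so the input size varies from module to module as the abstract describes.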
