Refinements of regression-based context-dependent modelling of deep neural networks for automatic speech recognition


Abstract

The data sparsity problem of context-dependent (CD) acoustic modelling of deep neural networks (DNNs) in speech recognition is addressed by using the decision tree state clusters as the training targets. The CD states within a cluster cannot be distinguished during decoding. This problem, referred to as the clustering problem, is not explicitly addressed in the current literature. In our previous work, a regression-based CD-DNN framework was proposed to address both the data sparsity and the clustering problems. This paper investigates several refinements for the regression-based CD-DNN including two more representative state approximation schemes and the incorporation of sequential learning. The two approximations are obtained based on the statistics learned from the training data. Sequential learning is applied to both broad phone DNN detectors and the regression NN. The proposed refinements are evaluated on a broadcast news transcription task. For the cross-entropy systems, the two approximations perform consistently better than our previous work. Consistent performance gain over the corresponding cross-entropy trained systems is also observed for both the baseline CD-DNN and the regression model with sequential learning.
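The clustering problem described in the abstract can be illustrated with a minimal sketch. It assumes a toy DNN output layer with one unit per decision tree cluster; the state names, the cluster assignments, and the logit values are hypothetical and are not taken from the paper. Each CD state is scored with the posterior of the cluster it is tied to, so two states in the same cluster always receive the same score.

```python
import math

# Hypothetical mapping from CD states to decision tree clusters (tied states).
# The state names and cluster ids are illustrative assumptions.
CD_STATE_TO_CLUSTER = {
    "a-b+c.s2": 0,
    "d-b+c.s2": 0,  # tied to the same cluster as "a-b+c.s2"
    "a-b+e.s2": 1,
}


def cluster_posteriors(logits):
    """Softmax over the cluster-level outputs of the DNN for one frame."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cd_state_scores(logits):
    """Score every CD state with the posterior of the cluster it belongs to."""
    posteriors = cluster_posteriors(logits)
    return {state: posteriors[c] for state, c in CD_STATE_TO_CLUSTER.items()}


# Made-up logits for a single frame, one value per cluster.
scores = cd_state_scores([2.0, 0.5])
```

In this sketch `scores["a-b+c.s2"]` and `scores["d-b+c.s2"]` are identical for any input frame, which is the ambiguity that the regression-based framework is intended to address.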


Bibliographic record

Source: IEEE International Conference on Acoustics, Speech and Signal Processing, 2014, pp. 3022-3026 (5 pages)
Authors: Wang Guangsen; Sim Khe Chai
Original format: PDF
Keywords: Articulatory Features; Canonical State Modelling; Context Dependent Modelling; Deep Neural Network; Logistic Regression; Sequential Learning

Similar literature

1. Regression-Based Context-Dependent Modeling of Deep Neural Networks for Speech Recognition [J]. Wang G., Sim K.C. Audio, Speech, and Language Processing, IEEE Transactions on. 2014, No. 11
2. Distributed Training of Deep Neural Network Acoustic Models for Automatic Speech Recognition: A comparison of current training strategies [J]. Cui Xiaodong, Zhang Wei, Finkler Ulrich. IEEE Signal Processing Magazine. 2020, No. 3
3. Investigation of Automatic Speech Recognition Systems via the Multilingual Deep Neural Network Modeling Methods for a Very Low-Resource Language, Chaha [J]. Tessfu Geteye Fantaye, Junqing Yu, Tulu Tilahun Hailu. Journal of Signal and Information Processing. 2020, No. 1
4. Refinements of regression-based context-dependent modelling of deep neural networks for automatic speech recognition [C]. Wang Guangsen, Sim Khe Chai. IEEE International Conference on Acoustics, Speech and Signal Processing. 2014
5. Multi-task learning deep neural networks for automatic speech recognition [D]. Chen, Dongpeng. 2015
6. Multi-resolution speech analysis for automatic speech recognition using deep neural networks: Experiments on TIMIT [O]. Doroteo T. Toledano, María Pilar Fernández-Gallego, Alicia Lozano-Diez. 2012
7. Adaptation of context-dependent deep neural networks for automatic speech recognition [O]. Kaisheng Yao, Dong Yu, Frank Seide. 2012
