IEEE International Conference on Acoustics, Speech and Signal Processing

Improving deep neural networks for LVCSR using dropout and shrinking structure

Abstract

Recently, hybrid deep neural network and hidden Markov model (DNN/HMM) systems have achieved dramatic gains over the conventional GMM/HMM method on various large vocabulary continuous speech recognition (LVCSR) tasks. In this paper, we propose two new methods to further improve the hybrid DNN/HMM model: i) use dropout as a pre-conditioner (DAP) to initialize the DNN prior to back-propagation (BP) for better recognition accuracy; ii) employ a shrinking DNN structure (sDNN), with hidden layers decreasing in size from bottom to top, to reduce model size and expedite computation. The proposed DAP method is evaluated on a 70-hour Mandarin transcription (PSC) task and the 309-hour Switchboard (SWB) task. Compared with the traditional greedy layer-wise pre-trained DNN, it achieves about 10% and 6.8% relative recognition error reduction on the PSC and SWB tasks respectively. In addition, we evaluate sDNN, as well as its combination with DAP, on the SWB task. Experimental results show that these methods can reduce the model size to 45% of the original and cut training and test time by 55%, without losing recognition accuracy.
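To make the two ideas concrete, the following is a minimal, hypothetical PyTorch sketch (not the authors' implementation). make_sdnn builds a feed-forward network whose hidden layers shrink from bottom to top (the sDNN idea), and dap_then_bp first trains briefly with dropout enabled, acting as a pre-conditioner, before disabling dropout and continuing with plain back-propagation (the DAP idea). All layer widths, the dropout rate, the learning rate, the epoch counts, and the loader of (feature, senone-label) mini-batches are illustrative assumptions.

```python
# A minimal, hypothetical sketch of the two ideas (not the authors' code).
# Layer widths, dropout rate, learning rate, and epoch counts are assumptions.
import torch
import torch.nn as nn


def make_sdnn(input_dim, hidden_dims, num_targets, dropout_p=0.0):
    """Feed-forward DNN whose hidden layers shrink from bottom to top,
    e.g. hidden_dims = [2048, 1500, 1100, 800, 600] (illustrative widths)."""
    layers, prev = [], input_dim
    for width in hidden_dims:
        layers += [nn.Linear(prev, width), nn.Sigmoid()]
        if dropout_p > 0:
            layers.append(nn.Dropout(dropout_p))
        prev = width
    layers.append(nn.Linear(prev, num_targets))  # senone logits; softmax is in the loss
    return nn.Sequential(*layers)


def train(model, loader, epochs, lr=0.008):
    """Plain mini-batch SGD back-propagation over (features, senone-label) pairs."""
    model.train()
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for feats, targets in loader:
            opt.zero_grad()
            loss_fn(model(feats), targets).backward()
            opt.step()


def dap_then_bp(input_dim, hidden_dims, num_targets, loader):
    """Dropout-as-pre-conditioner (DAP): a short dropout pass initializes the
    weights, then dropout is switched off and training continues with plain BP."""
    model = make_sdnn(input_dim, hidden_dims, num_targets, dropout_p=0.2)
    train(model, loader, epochs=1)          # pre-conditioning pass with dropout on
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.p = 0.0                       # disable dropout for the final BP phase
    train(model, loader, epochs=10)         # standard back-propagation fine-tuning
    return model
```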
