Conference: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing

Localization of non-linguistic events in spontaneous speech by Non-Negative Matrix Factorization and Long Short-Term Memory


Abstract

Features generated by Non-Negative Matrix Factorization (NMF) have been successfully introduced into robust speech processing, including noise-robust speech recognition and the detection of non-linguistic vocalizations. In this study, we introduce a novel tandem approach that integrates likelihood features derived from NMF into Bidirectional Long Short-Term Memory Recurrent Neural Networks (BLSTM-RNNs) in order to dynamically localize non-linguistic events, i.e., laughter, vocal noise, and non-vocal noise, in highly spontaneous speech. We compare our tandem architecture to a conventional phoneme-HMM-based speech recognizer as the baseline and achieve a 37.5% relative reduction in frame error rate in the discrimination of speech and different non-speech segments.
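The abstract does not say how the NMF-derived features are computed, so the following is only a minimal sketch of one common way to obtain per-frame NMF activations. It assumes a fixed nonnegative dictionary `W` (for example, spectral atoms learned per class), a magnitude spectrogram `V`, the Euclidean cost, and a simple per-frame normalization so that the activations can be read as likelihood-like scores. The function name `nmf_activation_features` and all parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def nmf_activation_features(V, W, n_iter=200, eps=1e-9):
    """Estimate activations H with V ~ W @ H while the dictionary W stays fixed.

    V: (n_bins, n_frames) nonnegative magnitude spectrogram.
    W: (n_bins, n_atoms) nonnegative dictionary (assumed to be given).
    Returns an (n_atoms, n_frames) array whose columns sum to about 1.
    """
    rng = np.random.default_rng(0)
    # Strictly positive initialization, as multiplicative updates require.
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative update for the Euclidean cost; only H is updated.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    # Assumption: normalize each frame so activations act as likelihood-like scores.
    return H / (H.sum(axis=0, keepdims=True) + eps)

# Toy usage with synthetic data: 6 frequency bins, 3 atoms, 5 frames.
rng = np.random.default_rng(1)
W_toy = rng.random((6, 3))
V_toy = W_toy @ rng.random((3, 5))
features = nmf_activation_features(V_toy, W_toy)
print(features.shape)
```

In a tandem setup of the kind the abstract describes, per-frame feature vectors like these would be passed to a sequence classifier such as a BLSTM-RNN; that stage is not sketched here.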
