
Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature

Abstract

We propose an approach to reverberant speech recognition adopting deep learning in the front-end as well as back-end of a reverberant speech recognition system, and a novel method to improve the dereverberation performance of the front-end network using phone-class information. At the front-end, we adopt a deep autoencoder (DAE) for enhancing the speech feature parameters, and speech recognition is performed in the back-end using DNN-HMM acoustic models trained on multi-condition data. The system was evaluated through the ASR task in the Reverb Challenge 2014. The DNN-HMM system trained on the multi-condition training set achieved a conspicuously higher word accuracy compared to the MLLR-adapted GMM-HMM system trained on the same data. Furthermore, feature enhancement with the deep autoencoder contributed to the improvement of recognition accuracy especially in the more adverse conditions. While the mapping between reverberant and clean speech in DAE-based dereverberation is conventionally conducted only with the acoustic information, we presume the mapping is also dependent on the phone information. Therefore, we propose a new scheme (pDAE), which augments a phone-class feature to the standard acoustic features as input. Two types of the phone-class feature are investigated. One is the hard recognition result of monophones, and the other is a soft representation derived from the posterior outputs of monophone DNN. The augmented feature in either type results in a significant improvement (7–8% relative) from the standard DAE.
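The input augmentation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy feature dimensions, and the number of phone classes are all hypothetical, and the abstract does not say how the features are spliced or normalized. The sketch only shows the two phone-class variants (a hard one-hot vector from a recognized monophone label, and a soft vector of per-frame posteriors) being appended to each frame of acoustic features.

```python
import math


def hard_phone_feature(phone_ids, num_phones):
    """Hard variant: one-hot vector per frame at the recognized monophone index."""
    rows = []
    for pid in phone_ids:
        row = [0.0] * num_phones
        row[pid] = 1.0
        rows.append(row)
    return rows


def soft_phone_feature(scores):
    """Soft variant: per-frame softmax over monophone scores (posterior-like)."""
    rows = []
    for frame in scores:
        peak = max(frame)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in frame]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows


def augment(acoustic, phone_feat):
    """Append the phone-class feature to the acoustic feature of each frame."""
    if len(acoustic) != len(phone_feat):
        raise ValueError("acoustic and phone-class features differ in frame count")
    return [list(a) + list(p) for a, p in zip(acoustic, phone_feat)]


# Toy data (assumed): 2 frames, 3 acoustic dimensions, 4 phone classes.
acoustic = [[0.2, -1.0, 0.5], [0.1, 0.3, -0.4]]
hard_input = augment(acoustic, hard_phone_feature([2, 0], num_phones=4))
soft_input = augment(
    acoustic,
    soft_phone_feature([[0.0, 0.0, 2.0, 0.0], [1.0, 0.0, 0.0, 0.0]]),
)
```

In both variants the enhancement network would receive the wider vector as input while its target stays the clean acoustic feature; only the input side changes.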


Bibliographic details

Authors: Mimura Masato; Sakai Shinsuke; Kawahara Tatsuya
Year: 2015
Original format: PDF
Language: English

Similar literature


1. Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature [J]. Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara. EURASIP Journal on Advances in Signal Processing, 2015, No. 1.
2. Recognition of words from brain-generated signals of speech-impaired people: Application of autoencoders as a neural Turing machine controller in deep neural networks [J]. Boloukian Behzad, Safi-Esfahani Faramarz. Neural Networks: The Official Journal of the International Neural Network Society, 2020.
3. Ensemble of jointly trained deep neural network-based acoustic models for reverberant speech recognition [J]. Lee Moa, Lee Jeehye, Chang Joon-Hyuk. Digital Signal Processing, 2019.
4. Deep autoencoders augmented with phone-class feature for reverberant speech recognition [C]. Mimura Masato, Sakai Shinsuke, Kawahara Tatsuya. IEEE International Conference on Acoustics, Speech and Signal Processing, 2015.
5. Dysarthric Speech Recognition and Offline Handwriting Recognition using Deep Neural Networks [D]. Pillai, Suhas Balkrishna. 2017.
6. Evaluation of Mixed Deep Neural Networks for Reverberant Speech Enhancement [O]. Michelle Gutiérrez-Muñoz, Astryd González-Salazar, Marvin Coto-Jiménez. 2020.
7. Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature [O]. Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara. 2015.
