On DNN posterior probability combination in multi-stream speech recognition for reverberant environments

Abstract

A multi-stream framework with deep neural network (DNN) classifiers has been applied in this paper to improve automatic speech recognition (ASR) performance in environments with different reverberation characteristics. We propose a room parameter estimation model to determine the stream weights for DNN posterior probability combination with the aim of obtaining reliable log-likelihoods for decoding. The model is implemented by training a multi-layer perceptron to distinguish between various reverberant environments. The method is tested in known and unknown environments against approaches based on inverse entropy and autoencoders, with average relative word error rate improvements of 46% and 29%, respectively, when performing multi-stream ASR in different reverberant situations.
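The abstract does not state the exact combination rule, so the following is only a minimal sketch of one common way to combine per-stream DNN posteriors: a per-frame weighted sum, with weights taken from the inverse-entropy baseline the abstract mentions. The function names (`frame_entropy`, `inverse_entropy_weights`, `combine_posteriors`) and the toy posteriors are illustrative assumptions, not the authors' implementation. In the proposed method the weights would presumably come from the room parameter estimation model rather than from entropy.

```python
import numpy as np


def frame_entropy(posteriors, eps=1e-12):
    """Shannon entropy per frame; `posteriors` has shape (frames, classes)."""
    p = np.clip(posteriors, eps, 1.0)
    return -np.sum(p * np.log(p), axis=1)


def inverse_entropy_weights(streams, eps=1e-12):
    """Per-frame weights proportional to 1/entropy, normalised over streams.

    Returns an array of shape (streams, frames).
    """
    inv = np.stack([1.0 / (frame_entropy(s) + eps) for s in streams])
    return inv / inv.sum(axis=0, keepdims=True)


def combine_posteriors(streams, weights):
    """Weighted sum of stream posteriors, renormalised per frame."""
    stacked = np.stack(streams)  # (streams, frames, classes)
    combined = np.sum(weights[:, :, None] * stacked, axis=0)
    return combined / combined.sum(axis=1, keepdims=True)


# Toy example (made-up numbers): one frame, three classes, two streams.
confident = np.array([[0.90, 0.05, 0.05]])  # low entropy, should get more weight
uncertain = np.array([[0.34, 0.33, 0.33]])  # near-uniform, should get less weight

weights = inverse_entropy_weights([confident, uncertain])
combined = combine_posteriors([confident, uncertain], weights)
```

The confident stream receives the larger weight because its posterior has lower entropy. Swapping in weights from another estimator only requires producing an array of the same `(streams, frames)` shape.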

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2017 | pp. 5250-5254 | 5 pages
Authors
Feifei Xiong; Stefan Goetze; Bernd T. Meyer
Original format: PDF
Keywords
Training; Speech; Reverberation; Hidden Markov models; Entropy; Training data; Speech recognition
Date indexed: 2022-08-26 15:24:46

Similar literature

1. Robust Front End Processing for Speech Recognition in Reverberant Environments: Utilization of Speech Properties [J]. Rico PETRICK, Xugang LU, Masashi UNOKI. 電子情報通信学会技術研究報告. 音声 (Speech). 2008, No. 142
2. Robust Front End Processing for Speech Recognition in Reverberant Environments: Utilization of Speech Properties [J]. Rico PETRICK, Xugang LU, Masashi UNOKI. 電子情報通信学会技術研究報告. 2008, No. 142
3. Speech Enhancement Using Multi-channel Post-Filtering with Modified Signal Presence Probability in Reverberant Environment [J]. Xiaofei Wang, Yanmeng Guo, Qiang Fu. Chinese Journal of Electronics. 2016, No. 3
4. On DNN posterior probability combination in multi-stream speech recognition for reverberant environments [C]. Feifei Xiong, Stefan Goetze, Bernd T. Meyer. IEEE International Conference on Acoustics, Speech and Signal Processing. 2017
5. Perceptually inspired signal processing strategies for robust speech recognition in reverberant environments [D]. Kingsbury, Brian E. D. 1998
6. Time course of a perceptual enhancement effect for noise-masked speech in reverberant environments [O]. Eugene Brandewie, Pavel Zahorik
7. Combination strategy based on relative performance monitoring for multi-stream reverberant speech recognition [O]. Feifei Xiong, Stefan Goetze, Bernd T. Meyer. 2017
