Comparing Audio- and A-Posteriori-Probability-Based Stream Confidence Measures for Audio-Visual Speech Recognition

Abstract

During the fusion of audio and video information for speech recognition, estimating the reliability of the noise-affected audio channel is crucial for obtaining meaningful recognition results. In this paper, we compare two types of reliability measures. One uses the statistics of the phoneme a-posteriori probabilities; the other analyzes the audio signal itself. From the probability-based criteria, we implemented the entropy and the dispersion of the probabilities and, from the audio-based criteria, the so-called Voicing Index. To test the criteria, a hybrid ANN/HMM audio-visual recognition system was used, and 5 different types of noise at 12 SNR levels each were added to the audio signal. For each criterion, the best sigmoidal fit between the fusion parameter and the value of the criterion was computed over all noise types and SNR values. The resulting individual errors and the corresponding averaged relative errors are given.
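The two probability-based criteria named in the abstract can be illustrated with a small Python sketch. The abstract gives no formulas, so everything here is an assumption: entropy is taken as the Shannon entropy of one frame of phoneme posteriors, dispersion as the mean pairwise difference among the `k_best` largest posteriors (one common definition, not necessarily the authors'), and `sigmoid_map` is a hypothetical mapping whose `slope` and `midpoint` stand in for values that would come from a fit such as the one described in the abstract.

```python
import math


def entropy(posteriors):
    """Shannon entropy (in nats) of one frame of phoneme posteriors."""
    return -sum(p * math.log(p) for p in posteriors if p > 0.0)


def dispersion(posteriors, k_best=4):
    """Mean pairwise difference among the k_best largest posteriors.

    Assumed definition: 2 / (K (K - 1)) * sum over pairs (p_i - p_j),
    with the posteriors sorted in descending order.
    """
    top = sorted(posteriors, reverse=True)[:k_best]
    k = len(top)
    if k < 2:
        return 0.0
    total = sum(top[i] - top[j] for i in range(k) for j in range(i + 1, k))
    return 2.0 * total / (k * (k - 1))


def sigmoid_map(value, slope, midpoint):
    """Hypothetical sigmoid from a criterion value to a weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-slope * (value - midpoint)))


# Illustrative frames only, not data from the paper.
confident_frame = [0.94, 0.02, 0.02, 0.02]  # one phoneme dominates
ambiguous_frame = [0.25, 0.25, 0.25, 0.25]  # all phonemes equally likely

for name, frame in [("confident", confident_frame), ("ambiguous", ambiguous_frame)]:
    h = entropy(frame)
    d = dispersion(frame)
    # Negative slope: higher entropy yields a lower audio weight.
    # slope and midpoint are placeholders, not fitted values.
    w = sigmoid_map(h, slope=-4.0, midpoint=0.7)
    print(f"{name}: entropy={h:.3f} dispersion={d:.3f} audio_weight={w:.3f}")
```

In this sketch, a peaked posterior distribution gives low entropy and high dispersion, while a flat one gives the opposite, which is why either statistic can serve as an indicator of audio reliability. The weight it yields is only a stand-in for the fusion parameter; the fusion rule itself is not specified in the abstract.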

Bibliographic record

Source
European Conference on Speech Communication and Technology | 2001 | 4 pages
Authors
Martin Heckmann; Thorsten Wild; Frederic Berthommier; Kristian Kroschel
Original format: PDF
Chinese Library Classification: Communication theory
