Conference: European Conference on Speech Communication and Technology

Comparing Audio- and A-Posteriori-Probability-Based Stream Confidence Measures for Audio-Visual Speech Recognition

Abstract

When fusing audio and video information for speech recognition, estimating the reliability of the noise-affected audio channel is crucial for obtaining meaningful recognition results. In this paper we compare two types of reliability measures: one uses the statistics of the phoneme a-posteriori probabilities, the other analyzes the audio signal itself. From the probability-based criteria we implemented the entropy and the dispersion of the probabilities and, from the audio-based criteria, the so-called Voicing Index. To test the criteria, a hybrid ANN/HMM audio-visual recognition system was used, and 5 different types of noise at 12 SNR levels each were added to the audio signal. For each criterion, the best sigmoidal fit between the fusion parameter and the value of the criterion was computed over all noise types and SNR values. The resulting individual errors and the corresponding averaged relative errors are given.
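
The two probability-based criteria and the sigmoidal mapping to a fusion parameter can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption: the entropy is the usual Shannon entropy of one frame's phoneme posteriors, the dispersion is one common definition (the mean pairwise difference among the `n_best` largest posteriors), and the sigmoid parameters `slope` and `offset` are hypothetical values chosen by hand, not fitted values from the paper. All function names are illustrative.

```python
import math


def frame_entropy(posteriors):
    """Shannon entropy (in nats) of one frame's phoneme posteriors.

    Low entropy means one phoneme dominates (audio likely reliable);
    high entropy means the posteriors are nearly flat.
    """
    return -sum(p * math.log(p) for p in posteriors if p > 0.0)


def frame_dispersion(posteriors, n_best=4):
    """Mean pairwise difference among the n_best largest posteriors.

    Assumed definition: 2 / (N (N - 1)) * sum_{i<j} (p_i - p_j), with the
    posteriors sorted in descending order. A flat distribution gives 0.
    """
    top = sorted(posteriors, reverse=True)[:n_best]
    n = len(top)
    if n < 2:
        return 0.0
    total = sum(top[i] - top[j] for i in range(n) for j in range(i + 1, n))
    return 2.0 * total / (n * (n - 1))


def mean_over_frames(measure, frames):
    """Average a per-frame measure over an utterance (a list of frames)."""
    return sum(measure(frame) for frame in frames) / len(frames)


def sigmoid_fusion_weight(criterion, slope, offset):
    """Map a criterion value to an audio stream weight in (0, 1).

    slope and offset are hypothetical parameters; in practice they would
    be fitted to data covering all noise types and SNR levels.
    """
    return 1.0 / (1.0 + math.exp(-slope * (criterion - offset)))


if __name__ == "__main__":
    # Toy posteriors over four phoneme classes (made-up numbers).
    clean_frame = [0.97, 0.01, 0.01, 0.01]   # one class dominates
    noisy_frame = [0.25, 0.25, 0.25, 0.25]   # no class stands out

    for name, frame in (("clean", clean_frame), ("noisy", noisy_frame)):
        entropy = frame_entropy(frame)
        dispersion = frame_dispersion(frame)
        # Negative slope: higher entropy should lower the audio weight.
        weight = sigmoid_fusion_weight(entropy, slope=-4.0, offset=0.7)
        print(name, entropy, dispersion, weight)
```

With entropy as the criterion the slope is negative, because a flatter posterior distribution suggests a less reliable audio stream. With dispersion the sign would flip, since a larger dispersion indicates a more confident classification.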
