Foreign-language conference: IEEE Power Engineering Society Winter Meeting, 2001

Multistage information fusion for audio-visual speech recognition

Abstract

This paper examines the information fusion problem in the context of audio-visual speech recognition. Existing approaches to audio-visual fusion typically address the problem in either the feature domain or the decision domain. We consider a hybrid approach that aims to take advantage of both the feature fusion and the decision fusion methodologies. We introduce a general formulation that facilitates information fusion at multiple stages, followed by an experimental study of a set of fusion schemes allowed by the framework. The proposed method is implemented on a real-time audio-visual speech recognition system and evaluated on connected digit recognition tasks under varying acoustic conditions. The results show that the multistage fusion system consistently achieves lower word error rates than the reference feature fusion and decision fusion systems. It is further shown that removing the audio-only channel from the multistage system leads to only minimal degradation in recognition performance while providing a noticeable reduction in computational load.
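
The distinction between feature-domain and decision-domain fusion, and one way the two might be combined, can be illustrated with a minimal Python sketch. It is not the paper's formulation: the abstract gives no equations, so the stream names (`audio`, `av`), the distance-based `toy_score` classifier, the class templates, and the stream weights are all hypothetical choices made for illustration. Feature fusion is shown as concatenation of the audio and visual feature vectors, and decision fusion as a weighted sum of per-class scores across streams.

```python
from typing import Dict, List, Sequence

Scores = Dict[str, float]  # class label -> log-likelihood-like score


def feature_fusion(audio: Sequence[float], visual: Sequence[float]) -> List[float]:
    """Feature-domain fusion: concatenate the two feature vectors."""
    return list(audio) + list(visual)


def toy_score(features: Sequence[float],
              templates: Dict[str, Sequence[float]]) -> Scores:
    """Hypothetical stand-in for a classifier (assumption, not from the paper):
    negative squared distance from the features to each class template."""
    return {
        label: -sum((f - t) ** 2 for f, t in zip(features, template))
        for label, template in templates.items()
    }


def decision_fusion(stream_scores: Dict[str, Scores],
                    weights: Dict[str, float]) -> Scores:
    """Decision-domain fusion: weighted sum of per-class scores over streams."""
    labels = next(iter(stream_scores.values())).keys()
    return {
        label: sum(weights[name] * scores[label]
                   for name, scores in stream_scores.items())
        for label in labels
    }


def multistage_decision(audio: Sequence[float],
                        visual: Sequence[float],
                        audio_templates: Dict[str, Sequence[float]],
                        av_templates: Dict[str, Sequence[float]],
                        weights: Dict[str, float]) -> str:
    """Two stages: fuse features into an 'av' stream, then fuse the
    'audio' and 'av' stream scores at the decision level."""
    streams = {
        "audio": toy_score(audio, audio_templates),
        "av": toy_score(feature_fusion(audio, visual), av_templates),
    }
    fused = decision_fusion(streams, weights)
    return max(fused, key=fused.get)


# Illustrative one-dimensional features and templates (made-up values).
AUDIO_TEMPLATES = {"one": [0.0], "two": [1.0]}
AV_TEMPLATES = {"one": [0.0, 0.0], "two": [1.0, 1.0]}

label = multistage_decision([0.4], [1.0], AUDIO_TEMPLATES, AV_TEMPLATES,
                            {"audio": 0.5, "av": 0.5})
print(label)
```

In this sketch, setting the `audio` weight to zero corresponds loosely to dropping the audio-only channel. Whether that matches the configuration evaluated in the paper cannot be determined from the abstract alone.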
