Multistage information fusion for audio-visual speech recognition

Abstract

This paper examines the information fusion problem in audio-visual speech recognition. Existing approaches to audio-visual fusion typically address the problem in either the feature domain or the decision domain. We consider a hybrid approach that aims to take advantage of both the feature fusion and decision fusion methodologies. We introduce a general formulation that supports information fusion at multiple stages, followed by an experimental study of a set of fusion schemes that the framework allows. The proposed method is implemented on a real-time audio-visual speech recognition system and evaluated on connected-digit recognition tasks under varying acoustic conditions. The results show that the multistage fusion system consistently achieves lower word error rates than the reference feature fusion and decision fusion systems. It is further shown that removing the audio-only channel from the multistage system leads to only minimal degradation in recognition performance while providing a noticeable reduction in computational load.
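The abstract does not give the paper's actual formulation, so the following is only a minimal, generic sketch of the three ideas it names: feature fusion (concatenating audio and visual features), decision fusion (a weighted sum of per-class log-likelihoods from separate streams), and a multistage combination in which a feature-fused stream is treated as one more stream at the decision stage. The function names, the stream names (`audio`, `visual`, `av`), the weights, and all scores are hypothetical and are not taken from the paper.

```python
def feature_fusion(audio_feat, visual_feat):
    """Early (feature-domain) fusion: concatenate the two feature vectors."""
    return list(audio_feat) + list(visual_feat)


def decision_fusion(stream_scores, weights):
    """Late (decision-domain) fusion: weighted sum of per-class log-likelihoods.

    stream_scores maps a stream name to a list of per-class log-likelihoods;
    weights maps the same stream names to non-negative stream weights.
    """
    names = list(stream_scores)
    n_classes = len(stream_scores[names[0]])
    combined = [0.0] * n_classes
    for name in names:
        scores = stream_scores[name]
        if len(scores) != n_classes:
            raise ValueError("all streams must score the same set of classes")
        for k, score in enumerate(scores):
            combined[k] += weights[name] * score
    return combined


def multistage_fusion(stream_scores, weights, drop=()):
    """Hybrid sketch: fuse the scores of every kept stream at the decision stage.

    The 'av' stream is assumed to have been scored by a classifier that ran on
    feature_fusion() output. Streams listed in `drop` are skipped and the
    remaining weights are renormalised to sum to 1.
    """
    kept = {n: s for n, s in stream_scores.items() if n not in drop}
    if not kept:
        raise ValueError("at least one stream must remain")
    total = sum(weights[n] for n in kept)
    norm = {n: weights[n] / total for n in kept}
    combined = decision_fusion(kept, norm)
    best = max(range(len(combined)), key=lambda k: combined[k])
    return best, combined


# Hypothetical per-class log-likelihoods for three candidate classes.
scores = {
    "audio": [-4.0, -2.0, -3.0],
    "visual": [-3.0, -3.5, -1.0],
    "av": [-2.5, -1.5, -3.0],  # assumed to come from the feature-fused stream
}
weights = {"audio": 0.3, "visual": 0.2, "av": 0.5}

print(multistage_fusion(scores, weights))
print(multistage_fusion(scores, weights, drop=("audio",)))
```

Passing `drop=("audio",)` mirrors the configuration mentioned in the abstract, where the audio-only channel is removed: one fewer stream has to be scored, which is where a saving in computation would come from. In a real recogniser the stream weights would normally be tuned to the acoustic conditions; the fixed values here are placeholders.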

Bibliographic details

Source: IEEE Power Engineering Society Winter Meeting, 2001, 2001 | 2001 | pp. 1651-1654 | 4 pages
Authors: Chu S.M.; Libal V.; Marcheret E.; Neti C.; Potamianos G.
Original format: PDF
CLC classification: Automation technology, computer technology
