Feature space video stream consistency estimation for dynamic stream weighting in audio-visual speech recognition

Abstract

Most current audio-visual automatic speech recognition (AVASR) systems use static weights to balance audio and visual information during fusion. State-of-the-art research has introduced audio reliability metrics that change the fusion weights dynamically, which improves overall recognition results. So far, however, incorporating visual reliability metrics into these audio-reliability-based systems has not significantly improved performance. We introduce a new approach to this problem: we infer the “consistency” between the audio and visual information and leverage the existing audio reliability metrics to create a video reliability metric. Our approach is formulated in the extracted feature space and thus does not rely on analyzing the video signal itself. The framework presented in this work is competitive with systems based on audio-only reliability metrics and shows promise of consistently outperforming them.
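The abstract gives no formulas, so the following is only a minimal sketch of what dynamic stream weighting generally looks like, assuming the common weighted combination of per-class log-likelihoods. The function names and the mapping from reliability scores to a weight are hypothetical illustrations; they are not the consistency estimator proposed in the paper.

```python
def fuse_log_likelihoods(audio_ll, video_ll, audio_weight):
    """Combine per-class log-likelihoods as w * audio + (1 - w) * video."""
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    return [audio_weight * a + (1.0 - audio_weight) * v
            for a, v in zip(audio_ll, video_ll)]


def weight_from_reliability(audio_reliability, video_reliability):
    """Hypothetical mapping: normalize two non-negative reliability scores.

    Assumption for illustration only; the paper derives its video
    reliability from audio-visual consistency in feature space.
    """
    total = audio_reliability + video_reliability
    if total == 0.0:
        return 0.5  # no information about either stream: weight them equally
    return audio_reliability / total


def pick_class(audio_ll, video_ll, audio_reliability, video_reliability):
    """Return the index of the class with the highest fused score."""
    weight = weight_from_reliability(audio_reliability, video_reliability)
    fused = fuse_log_likelihoods(audio_ll, video_ll, weight)
    return max(range(len(fused)), key=fused.__getitem__)


# Two classes; the audio stream favors class 0, the video stream class 1.
audio_ll = [-1.0, -3.0]
video_ll = [-4.0, -2.0]
clean_audio = pick_class(audio_ll, video_ll, 0.9, 0.1)  # audio trusted
noisy_audio = pick_class(audio_ll, video_ll, 0.1, 0.9)  # video trusted
```

With a static weight the decision would be the same in both conditions; recomputing the weight from the reliability scores for each frame or utterance is what makes the weighting dynamic.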

Bibliographic record

Source
IEEE Autotestcon Conference | 2008 | 4 pages
Original format: PDF
Chinese Library Classification: TP274-53
Keywords
Hidden Markov Models; Speech Recognition; Vector Quantization

Similar literature

1. On Dynamic Stream Weighting for Audio-Visual Speech Recognition [J]. Estellers V., Gurban M., Thiran J.-P. IEEE Transactions on Audio, Speech, and Language Processing. 2012, no. 4
2. Improved features and dynamic stream weight adaption for robust Audio-Visual Speech Recognition framework [J]. Saudi Ali S., Khalil Mahmoud I., Abbas Hazem M. Digital Signal Processing. 2019
3. Noise Adaptive Stream Weighting in Audio-Visual Speech Recognition [J]. Martin Heckmann, Frédé… EURASIP Journal on Advances in Signal Processing. 2002, no. 11
4. Feature space video stream consistency estimation for dynamic stream weighting in audio-visual speech recognition [C]. Louis H. Terry, Derek J. Shiell, Aggelos K. Katsaggelos. International Conference on Image Processing. 2008
5. Ensemble feature selection for multi-stream automatic speech recognition [D]. Gelbart, David. 2008
6. Classifying Imbalanced Data Streams via Dynamic Feature Group Weighting with Importance Sampling [O]. Ke Wu, Andrea Edwards, Wei Fan
7. Feature space video stream consistency estimation for dynamic stream weighting in audio-visual speech recognition [O]. Louis H. Terry, Derek J. Shiell, Aggelos K. Katsaggelos. 2014
