Object Category Recognition Using Probabilistic Fusion of Speech and Image Classifiers


Abstract

Multimodal scene understanding is an integral part of human-robot interaction (HRI) in situated environments. Especially useful is category-level recognition, where the system can recognize classes of objects or scenes rather than specific instances (e.g., any chair vs. this particular chair). Humans use multiple modalities to understand which object category is being referred to, simultaneously interpreting gesture, speech, and visual appearance, and using one modality to disambiguate the information contained in the others. In this paper, we address the problem of fusing visual and acoustic information to predict object categories when an image of the object and speech input from the user are available to the HRI system. Using probabilistic decision fusion, we show improved classification rates on a dataset containing a wide variety of object categories, compared to using either modality alone.
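
The abstract does not state which fusion rule is used. As an illustration only, the sketch below shows one common form of probabilistic decision fusion: a weighted log-linear combination of the per-class posteriors from two classifiers. The function name `fuse_posteriors`, the `speech_weight` parameter, the category names, and all probability values are hypothetical and are not taken from the paper.

```python
import math


def fuse_posteriors(p_speech, p_image, speech_weight=0.5):
    """Fuse two per-class posterior lists with a weighted log-linear rule.

    Illustrative assumption, not the authors' method: each fused score is
    p_speech**w * p_image**(1 - w), renormalized to sum to 1.
    """
    eps = 1e-12  # guards against log(0) for classes with zero probability
    log_scores = [
        speech_weight * math.log(ps + eps)
        + (1.0 - speech_weight) * math.log(pi + eps)
        for ps, pi in zip(p_speech, p_image)
    ]
    top = max(log_scores)  # subtract the max before exponentiating for stability
    scores = [math.exp(s - top) for s in log_scores]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical posteriors over three made-up categories.
categories = ["chair", "cup", "keyboard"]
p_speech = [0.45, 0.40, 0.15]  # speech classifier narrowly prefers "chair"
p_image = [0.20, 0.70, 0.10]   # image classifier clearly prefers "cup"

fused = fuse_posteriors(p_speech, p_image)
predicted = categories[fused.index(max(fused))]
print(predicted, [round(p, 3) for p in fused])
```

In this made-up case the speech classifier alone would pick "chair", but the confident image posterior shifts the fused decision to "cup". In practice the weight would be tuned on held-out data, for example to reflect how reliable each modality is.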


Bibliographic details

Source: International Workshop on Machine Learning for Multimodal Interaction (MLMI 2008) | 2008 | 12 pages
Authors: Kate Saenko; Trevor Darrell
Original format: PDF
Chinese Library Classification: TP3-53
Keywords: Multimodal fusion; Object recognition; Human-computer interaction

