Multimedia event detection with multimodal feature fusion and temporal concept localization

Sangmin Oh; Scott McCloskey; Ilseo Kim; Arash Vahdat; Kevin J. Cannons; Hossein Hajimirsadeghi; Greg Mori; A. G. Amitha Perera; Megha Pandey; Jason J. Corso


Abstract

We present a system for multimedia event detection. The developed system characterizes complex multimedia events based on a large array of multimodal features, and classifies unseen videos by effectively fusing diverse responses. We present three major technical innovations. First, we explore novel visual and audio features across multiple semantic granularities, including building, often in an unsupervised manner, mid-level and high-level features upon low-level features to enable semantic understanding. Second, we show a novel Latent SVM model which learns and localizes discriminative high-level concepts in cluttered video sequences. In addition to improving detection accuracy beyond existing approaches, it enables a unique summary for every retrieval by its use of high-level concepts and temporal evidence localization. The resulting summary provides some transparency into why the system classified the video as it did. Finally, we present novel fusion learning algorithms and our methodology to improve fusion learning under limited training data conditions. Thorough evaluation on the large TRECVID MED 2011 dataset showcases the benefits of the presented system.
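
The abstract names two ideas that a small example may make more concrete: fusing the responses of per-modality classifiers, and localizing the segment of a video that gives the strongest evidence for a concept. The sketch below is illustrative only and is not the authors' method; the abstract gives no formulas. It assumes a plain weighted average for fusion and a linear score maximized over segments for localization (the usual inference step in latent-variable models). The function names, modality names, weights, and feature values are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def fuse_scores(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-modality detection scores (simple late fusion).

    Assumption: each modality already produced one score per video, and the
    weights are given. The paper learns its fusion; this sketch does not.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * score for name, score in scores.items()) / total_weight


def localize_concept(
    concept_weights: Sequence[float],
    segment_features: List[Sequence[float]],
) -> Tuple[int, float]:
    """Return (index, score) of the segment with the highest linear score.

    Assumption: the segment index plays the role of the latent variable, and
    the video-level score is the maximum over segments.
    """
    if not segment_features:
        raise ValueError("need at least one segment")
    best_index, best_score = -1, float("-inf")
    for index, features in enumerate(segment_features):
        score = sum(w * f for w, f in zip(concept_weights, features))
        if score > best_score:
            best_index, best_score = index, score
    return best_index, best_score


if __name__ == "__main__":
    # Hypothetical scores from two modality-specific classifiers.
    fused = fuse_scores({"visual": 0.8, "audio": 0.4}, {"visual": 3.0, "audio": 1.0})
    print("fused score:", fused)

    # Hypothetical two-dimensional features for three video segments.
    index, score = localize_concept([1.0, -1.0], [[0.2, 0.5], [0.9, 0.1], [0.4, 0.4]])
    print("best segment:", index, "score:", score)
```

The returned segment index is what would let a system point to when in the video the evidence occurred, which is the kind of summary the abstract describes.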

Bibliographic record

Source
Machine Vision and Applications | 2014, Issue 1 | pp. 49-69 | 21 pages
Authors
Sangmin Oh; Scott McCloskey; Ilseo Kim; Arash Vahdat; Kevin J. Cannons; Hossein Hajimirsadeghi; Greg Mori; A. G. Amitha Perera; Megha Pandey; Jason J. Corso
Author affiliations

Kitware Inc., Clifton Park, New York, USA;

Honeywell Labs, Minneapolis, USA;

Kitware Inc., Clifton Park, New York, USA;

School of Computing Science, Simon Fraser University, Burnaby, Canada;

School of Computing Science, Simon Fraser University, Burnaby, Canada;

School of Computing Science, Simon Fraser University, Burnaby, Canada;

School of Computing Science, Simon Fraser University, Burnaby, Canada;

Kitware Inc., Clifton Park, New York, USA;

Kitware Inc., Clifton Park, New York, USA;

Department of Computer Science and Engineering, SUNY at Buffalo, Buffalo, USA;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
Multimedia; Classification; Machine learning; Fusion

