International Journal of Multimedia Information Retrieval

Video concept detection by audio-visual grouplets


Abstract

We investigate general concept classification in unconstrained videos by joint audio-visual analysis. An audio-visual grouplet (AVG) representation is proposed based on analyzing the statistical temporal audio-visual interactions. Each AVG contains a set of audio and visual codewords that are grouped together according to their strong temporal correlations in videos, and the AVG carries unique audio-visual cues to represent the video content. By using entire AVGs as building elements, video concepts can be classified more robustly than with traditional vocabularies of discrete audio or visual codewords. Specifically, we conduct coarse-level foreground/background separation in both the audio and visual channels, and discover four types of AVGs by exploring mixed-and-matched temporal audio-visual correlations among the following factors: visual foreground, visual background, audio foreground, and audio background. All of these types of AVGs provide discriminative audio-visual patterns for classifying various semantic concepts. To use the AVGs effectively for improved concept classification, a distance metric learning algorithm is further developed. Based on the AVG structure, the algorithm uses an iterative quadratic programming formulation to learn the optimal distances between data points in the large-margin nearest-neighbor setting. Various types of grouplet-based distances can be computed using individual AVGs, and through our distance metric learning algorithm these grouplet-based distances can be aggregated for final classification. We extensively evaluate our method over the large-scale Columbia consumer video set. Experiments demonstrate that the AVG-based audio-visual representation achieves consistent and significant performance improvements compared with other state-of-the-art approaches.
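To make the grouplet idea concrete, here is a minimal illustrative sketch (not the paper's implementation, which uses foreground/background separation and four AVG types): given per-frame activation histograms of audio and visual codewords, pairs whose activation time series are strongly temporally correlated are grouped into an AVG. The function name, the threshold, and the simple one-audio-codeword-per-grouplet structure are all assumptions made for illustration.

```python
# Illustrative sketch, assuming per-frame codeword activations are available.
# Not the authors' algorithm: real AVGs also involve foreground/background
# separation and a learned distance metric over grouplet-based distances.
import numpy as np

def build_grouplets(audio_act, visual_act, thresh=0.8):
    """audio_act: (T, A) activations of A audio codewords over T frames.
    visual_act: (T, V) activations of V visual codewords.
    Returns a list of (audio_ids, visual_ids) grouplets whose time series
    have absolute Pearson correlation >= thresh."""
    A = audio_act.shape[1]
    # corrcoef stacks row-variables; take the audio-vs-visual block.
    corr = np.corrcoef(audio_act.T, visual_act.T)[:A, A:]
    grouplets = []
    for a in range(A):
        linked = np.where(np.abs(corr[a]) >= thresh)[0]
        if linked.size:
            grouplets.append(([a], linked.tolist()))
    return grouplets
```

In this toy form, each grouplet ties one audio codeword to the visual codewords that co-fluctuate with it over time; a classifier can then treat each grouplet (rather than each isolated codeword) as a feature unit, which is the representational shift the abstract describes.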


