Multimodal and ontology-based fusion approaches of audio and visual processing for violence detection in movies

Thanassis Perperis; Theodoros Giannakopoulos; Alexandros Makris; Dimitrios I. Kosmopoulos; Sofia Tsekeridou; Stavros J. Perantonis; Sergios Theodoridis

Abstract

In this paper we present our research on detecting violent scenes in movies using fusion methodologies based on learning, knowledge representation and reasoning. We follow a multi-step approach: first, automated audio and visual analysis extracts audio and visual cues. Then two different fusion approaches are deployed: (i) a multimodal one that uses machine learning techniques to provide binary decisions on whether violence is present, and (ii) an ontological and reasoning one that combines the audio-visual cues with violence and multimedia ontologies. The latter infers not only whether violence is present in a video scene but also its type (fight, screams, gunshots). Both approaches are experimentally tested, validated and compared on the binary decision problem of violence detection. Finally, results for violence type identification are presented for the ontological fusion approach. For evaluation purposes, a large dataset of real movie data has been assembled.
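
The two fusion routes described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name the classifiers, features or ontology rules, so the cue names (`gunshot`, `scream`, `high_activity`, `gunfire_flash`), the weighted-average fusion and the hand-written rule table below are all assumptions chosen only to show the shape of the idea. The binary route is stood in for by a simple weighted late fusion, and the ontological route by a rule lookup.

```python
from dataclasses import dataclass


@dataclass
class SegmentCues:
    """Hypothetical per-segment cue scores in [0, 1]; names are illustrative."""
    audio: dict   # e.g. {"gunshot": 0.9, "scream": 0.1, "fight": 0.2}
    visual: dict  # e.g. {"high_activity": 0.8, "gunfire_flash": 0.7}


def fuse_binary(cues, w_audio=0.5, threshold=0.5):
    """Weighted late fusion (an assumed stand-in for the learned fusion):
    combine the strongest audio cue with the strongest visual cue."""
    audio_score = max(cues.audio.values(), default=0.0)
    visual_score = max(cues.visual.values(), default=0.0)
    fused = w_audio * audio_score + (1.0 - w_audio) * visual_score
    return fused >= threshold


# Hand-written rules standing in for ontology-based reasoning (assumption).
RULES = {
    "gunshots": [("audio", "gunshot"), ("visual", "gunfire_flash")],
    "screams": [("audio", "scream")],
    "fight": [("audio", "fight"), ("visual", "high_activity")],
}


def infer_types(cues, threshold=0.5):
    """Return the violence types whose required cues all reach the threshold."""
    found = []
    for vtype, required in RULES.items():
        if all(getattr(cues, modality).get(name, 0.0) >= threshold
               for modality, name in required):
            found.append(vtype)
    return found
```

Calling `fuse_binary` on a segment gives the violent or non-violent decision, while `infer_types` lists the violence types supported by the cues. A real system would replace the weighted average with a trained classifier and the rule table with reasoning over the violence and multimedia ontologies.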

Bibliographic record

Source
Expert Systems with Applications, 2011, Issue 11, pp. 14102-14116 (15 pages)
Authors
Thanassis Perperis; Theodoros Giannakopoulos; Alexandros Makris; Dimitrios I. Kosmopoulos; Sofia Tsekeridou; Stavros J. Perantonis; Sergios Theodoridis
Author affiliations

Dept. of Informatics and Telecommunications, University of Athens, GR 15784, Greece;

Dept. of Informatics and Telecommunications, University of Athens, GR 15784, Greece;

NCSR Demokritos, Inst. of Informatics and Telecommunications, GR 15310, Greece;

NCSR Demokritos, Inst. of Informatics and Telecommunications, GR 15310, Greece;

Athens Information Technology (AIT), 0.8 km Markopoulou Ave., GR 19002 Peania, Athens, Greece;

NCSR Demokritos, Inst. of Informatics and Telecommunications, GR 15310, Greece;

Dept. of Informatics and Telecommunications, University of Athens, GR 15784, Greece;

Indexing information
Original format: PDF
Language: English
Keywords
violence; movie; multimodal fusion; learning; ontology; knowledge representation; reasoning;
