Multimedia Tools and Applications

Multimodal extraction of events and of information about the recording activity in user generated videos


Abstract

In this work we propose methods that exploit context sensor data modalities to detect interesting events and to extract high-level contextual information about the recording activity in user generated videos. Most camera-enabled electronic devices contain auxiliary sensors such as accelerometers, compasses, and GPS receivers. Data captured by these sensors during media acquisition have already been used to compensate for camera degradations such as shake and to provide basic tagging information such as location. However, exploiting the sensor-data modality for higher-level information extraction, such as the detection of interesting events, has received rather limited attention, and prior work has been further constrained to specialized acquisition setups. We first show how these sensor modalities allow inferring information about each individual video recording, namely camera movements and content degradations. In addition, we consider a multi-camera scenario in which multiple user generated recordings of a common scene (e.g., a music concert) are available. For this kind of scenario we jointly analyze the multiple video recordings and their associated sensor modalities to extract higher-level semantics of the recorded media: based on the orientations of the cameras we identify the region of interest of the recorded scene, and by exploiting correlation in the motion of the different cameras we detect generic interesting events and estimate their relative positions. Furthermore, by also analyzing the audio content captured by multiple users we detect more specific interesting events. We show that the proposed multimodal analysis methods perform well on various recordings obtained at real live music performances.
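
The abstract describes the methods only at a high level, and the paper's actual algorithms are not reproduced on this page. As a minimal sketch of the orientation-based idea, the snippet below intersects the compass viewing rays of several cameras in a least-squares sense to locate a shared region of interest. The function name, the local east/north coordinate frame, and all numeric values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def estimate_roi_center(positions, headings_deg):
    """Least-squares intersection of camera viewing rays (hypothetical helper).

    positions    -- (N, 2) camera positions in a local east/north frame
                    (metres), e.g. derived from GPS fixes.
    headings_deg -- (N,) compass headings in degrees (0 = north, clockwise),
                    e.g. from each device's magnetometer.
    Returns the point minimising the summed squared distance to all
    viewing lines -- a simple proxy for the region of interest.
    """
    positions = np.asarray(positions, dtype=float)
    theta = np.radians(np.asarray(headings_deg, dtype=float))
    # Unit viewing directions: compass 0 deg = +north (y), 90 deg = +east (x).
    d = np.stack([np.sin(theta), np.cos(theta)], axis=1)
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for c, di in zip(positions, d):
        P = np.eye(2) - np.outer(di, di)  # projector orthogonal to the ray
        A += P
        b += P @ c
    return np.linalg.solve(A, b)

# Three cameras around a stage, all roughly aimed at the origin.
cams = [(-20.0, -30.0), (0.0, -40.0), (25.0, -28.0)]
hdgs = [33.7, 0.0, -41.8]  # degrees from north, clockwise
print(estimate_roi_center(cams, hdgs))  # close to (0, 0)
```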
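
The correlation-of-motion idea can likewise be sketched as a simple voting scheme: assuming synchronized per-camera angular-speed streams derived from the compass, a time window is flagged as a candidate interesting event when a large fraction of the cameras pan simultaneously. The window length, thresholds, and sampling rate below are assumed values for illustration, not parameters from the paper.

```python
import numpy as np

def detect_joint_panning(angular_speed, fs=10.0, win_s=2.0,
                         speed_thr=20.0, frac_thr=0.5):
    """Flag windows in which many cameras pan at the same time (sketch).

    angular_speed -- (N_cams, T) absolute compass angular speed in deg/s,
                     one row per camera, assumed pre-aligned on a common clock.
    fs            -- sampling rate of the sensor streams in Hz (assumed).
    win_s         -- analysis window length in seconds (assumed).
    speed_thr     -- per-camera panning threshold in deg/s (assumed).
    frac_thr      -- fraction of cameras that must pan simultaneously.
    Returns the start times (s) of candidate interesting events.
    """
    x = np.abs(np.asarray(angular_speed, dtype=float))
    n_cams, T = x.shape
    w = int(win_s * fs)
    events = []
    for start in range(0, T - w + 1, w):
        window = x[:, start:start + w]
        # A camera "pans" in this window if its peak angular speed is high.
        panning = window.max(axis=1) > speed_thr
        if panning.mean() >= frac_thr:
            events.append(start / fs)
    return events

# Synthetic demo: three cameras, all panning around t = 5 s.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 5.0, size=(3, 100))  # idle jitter, deg/s at 10 Hz
x[:, 50:60] += 40.0                       # joint pan at t = 5..6 s
print(detect_joint_panning(x))            # -> [4.0], window covering the pan
```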

