Spotting Audio-Visual Inconsistencies (SAVI) in Manipulated Video


Abstract

This paper is part of a larger effort to detect manipulations of video by searching for and combining the evidence of multiple types of inconsistencies between the audio and visual channels. Here, we focus on inconsistencies between the type of scenes detected in the audio and visual modalities (e.g., audio indoor, small room versus visual outdoor, urban), and inconsistencies in speaker identity tracking over a video given audio speaker features and visual face features (e.g., a voice change, but no talking face change). The scene inconsistency task was complicated by mismatches in the categories used in current visual scene and audio scene collections. To deal with this, we employed a novel semantic mapping method. The speaker identity inconsistency process was challenged by the complexity of comparing face tracks and audio speech clusters, requiring a novel method of fusing these two sources. Our progress on both tasks was demonstrated on two collections of tampered videos.
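The abstract gives two examples of inconsistency (audio indoor versus visual outdoor, and a voice change with no talking-face change) but does not describe the semantic mapping or fusion methods themselves. The following is only a minimal illustrative sketch of those two checks, not the authors' method. The attribute tables, label names, and both function names are hypothetical, and per-segment speaker and face labels are assumed to be already available.

```python
# Hypothetical label-to-attribute tables. The paper's actual semantic
# mapping between audio and visual scene categories is not given in the
# abstract; these entries are assumptions for illustration only.
AUDIO_SCENE_ATTRS = {
    "small_room": {"indoor"},
    "street": {"outdoor", "urban"},
}
VISUAL_SCENE_ATTRS = {
    "office": {"indoor"},
    "city_street": {"outdoor", "urban"},
}


def scenes_consistent(audio_label, visual_label):
    """Return True when the two scene labels share at least one attribute."""
    audio_attrs = AUDIO_SCENE_ATTRS[audio_label]
    visual_attrs = VISUAL_SCENE_ATTRS[visual_label]
    return bool(audio_attrs & visual_attrs)


def voice_change_without_face_change(speaker_ids, face_ids):
    """Return segment indices i where the audio speaker label differs from
    segment i-1 while the talking-face label stays the same."""
    if len(speaker_ids) != len(face_ids):
        raise ValueError("speaker_ids and face_ids must have equal length")
    return [
        i
        for i in range(1, len(speaker_ids))
        if speaker_ids[i] != speaker_ids[i - 1] and face_ids[i] == face_ids[i - 1]
    ]
```

For instance, `scenes_consistent("small_room", "city_street")` flags the indoor/outdoor mismatch mentioned in the abstract, and `voice_change_without_face_change(["A", "A", "B"], ["f1", "f1", "f1"])` reports the segment where the voice changes while the face does not. A real system would work with scored detections rather than hard labels.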


Bibliographic details

Source
IEEE Conference on Computer Vision and Pattern Recognition Workshops | 2017 | pp. 1907-1914 | 8 pages
Authors
Robert Bolles; J. Brian Burns; Martin Graciarena; Andreas Kathol; Aaron Lawson; Mitchell McLaren; Thomas Mensink;
Original format: PDF
Keywords
Visualization; Face; Feature extraction; Speech; Acoustics; Semantics; Lips;


Similar literature


1. Analysis of Audio-Visual Synchronous Patterns in Edited Videos - Towards an Aid for Attractive Video Editing [J]. Naoko NITTA, Noboru BABAGUCHI. 電子情報通信学会技術研究報告. パターン認識·メディア理解 (Pattern Recognition and Media Understanding). 2006, No. 376.
2. An automated framework for advertisement detection and removal from sports videos using audio-visual cues [J]. Abeer TOHEED, Ali JAVED, Aun IRTAZA. Frontiers of Computer Science. 2021, No. 2.
3. Content-Aware Summarization of Broadcast Sports Videos: An Audio-Visual Feature Extraction Approach [J]. Abdullah Aman Khan, Jie Shao, Waqar Ali. Neural Processing Letters. 2020, No. 3.
4. Spotting Audio-Visual Inconsistencies (SAVI) in Manipulated Video [C]. Robert Bolles, J. Brian Burns, Martin Graciarena. IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2017.
5. Discovering audio-visual associations in narrated videos of human activities [D]. Oezer, Tuna. 2008.
6. The Quality of Open-Access Video-Based Orthopaedic Instructional Content for the Shoulder Physical Exam is Inconsistent [O]. Ekaterina Urch, Samuel A. Taylor, Elizabeth Cody. 2016.
7. Audio-Visual Model for Generating Eating Sounds Using Food ASMR Videos [O]. Kodai Uchiyama, Kazuhiko Kawamoto. 2021.
8. Saekring Av Viktig Infrastruktur Syntesrapport Fran Forskningsprogrammet SAVI (Assuring Critical Infrastructure Synthesis Report for the Research Program SAVI) [R]. Fischer, G. 2005.
