AUDIO-VISUAL SYNCHRONY FOR DETECTION OF MONOLOGUES IN VIDEO ARCHIVES


Abstract

In this paper we present our approach to detecting monologues in video shots. A monologue shot is defined as a shot containing a talking person in the video channel with the corresponding speech in the audio channel. Whilst motivated by the TREC 2002 Video Retrieval Track (VT02), the underlying approach of measuring synchrony between audio and video signals is also applicable to voice- and face-based biometrics, to assessing lip-synchronization quality in movie editing, and to speaker localization in video. Our approach is envisioned as a two-part scheme. We first detect the occurrence of speech and a face in a video shot. Among shots containing both speech and a face, we identify monologue shots as those in which the speech and facial movements are synchronized. To measure the synchrony between speech and facial movements, we use a mutual-information-based measure. Experiments with the VT02 corpus indicate that using synchrony improves average precision by more than 50% relative to using face and speech information alone. Our synchrony-based monologue detector had the best average precision in VT02 among 18 submissions.
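
The two-part scheme can be illustrated with a minimal sketch. The abstract does not state which audio and visual features or which mutual-information estimator the authors use, so everything below is an assumption made for illustration: one scalar feature per frame on each channel (for example, audio energy and mouth-region motion), a jointly Gaussian model under which mutual information reduces to -0.5 * ln(1 - rho^2) for correlation rho, and hypothetical names (`gaussian_mutual_information`, `is_monologue`) with an arbitrary threshold.

```python
import math
import random


def pearson_correlation(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # a constant signal carries no information about the other
    return cov / math.sqrt(var_x * var_y)


def gaussian_mutual_information(audio, video):
    """Mutual information in nats, ASSUMING scalar, jointly Gaussian features."""
    rho = pearson_correlation(audio, video)
    rho_sq = min(rho * rho, 1.0 - 1e-12)  # keep the logarithm finite
    return -0.5 * math.log(1.0 - rho_sq)


def is_monologue(has_speech, has_face, audio, video, threshold=0.1):
    """Two-part decision: presence check first, then the synchrony test.

    `threshold` is a hypothetical value, not one taken from the paper.
    """
    if not (has_speech and has_face):
        return False
    return gaussian_mutual_information(audio, video) > threshold


# Synthetic per-frame features: one visual track follows the audio, one does not.
rng = random.Random(0)
audio = [rng.gauss(0, 1) for _ in range(2000)]
synced_video = [a + 0.5 * rng.gauss(0, 1) for a in audio]
unsynced_video = [rng.gauss(0, 1) for _ in range(2000)]

print(is_monologue(True, True, audio, synced_video))
print(is_monologue(True, True, audio, unsynced_video))
```

A shot with speech over a face that moves independently of the audio (for example, a voice-over on a silent on-screen person) passes the presence check but fails the synchrony test, which is the case the synchrony measure is meant to separate.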


Bibliographic details

Source: IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003, 4 pages
Author: IEEE
Original format: PDF
Chinese Library Classification: TN912-53

Similar literature

1. "Look who's talking!" Gaze Patterns for Implicit and Explicit Audio-Visual Speech Synchrony Detection in Children With High-Functioning Autism [J]. Grossman Ruth B., Steinhart Erin, Mitchell Teresa. Autism Research: Official Journal of the International Society for Autism Research. 2015, no. 3
2. Audio-visual synchrony increase the saliency of visual direction changes: Evidence from individual differences and probe detection performance [J]. Hauke Meyerhoff, Nina Gehrer. Journal of Vision. 2015, no. 12
3. Synching models with infants: a perceptual-level model of infant audio-visual synchrony detection [J]. Christopher G. Prince, George J. Hollich. Cognitive Systems Research. 2005, no. 1a4
4. AUDIO-VISUAL SYNCHRONY FOR DETECTION OF MONOLOGUES IN VIDEO ARCHIVES [C]. IEEE. IEEE International Conference on Acoustics, Speech, and Signal Processing. 2003
5. Discovering audio-visual associations in narrated videos of human activities [D]. Oezer, Tuna. 2008
6. Look who’s talking! Gaze patterns for implicit and explicit audio-visual speech synchrony detection in children with high-functioning autism [O]. Ruth B. Grossman, Erin Steinhart, Teresa Mitchell
7. AUDIO-VISUAL SYNCHRONY FOR DETECTION OF MONOLOGUES IN VIDEO ARCHIVES [O]. 2008
