The 'Audio-Visual Face Cover Corpus': Investigations into audio-visual speech and speaker recognition when the speaker's face is occluded by facewear

Abstract

The Audio-Visual Face Cover Corpus consists of high-quality audio and video recordings of 10 native British English speakers wearing different types of 'facewear'. Speakers read aloud a set of 64 /C_1VC_2/ syllables embedded in a carrier phrase. 18 English consonants occurred twice each in onset and coda positions. Speakers recited the list 1+8 times, i.e. once in the control condition (no facewear) and eight times while wearing a forensically relevant face covering. Audio recordings were made by simultaneously capturing the speech via a headband microphone and two shotgun microphones placed facing and behind the speaker. Footage of the subject's head and shoulders was filmed from two camera angles, frontal and half-profile. In total, 6,120 utterances were recorded per device. This paper aims to specify the database design, to introduce forensic-phonetic research utilising the data, and to demonstrate the corpus's potential applications in related fields of study and in casework conducted by forensic speech scientists.

Bibliographic details

Source
INTERSPEECH 2012 | 2012 | 4 pages
Author
Natalie Fecher
Original format: PDF
Chinese Library Classification: 73.4136083
Keywords
speech database; audio-visual; forensic speech science; facewear; disguise; acoustic phonetics; perception
Date indexed: 2022-08-20 22:09:19

Similar literature

1. Integration strategies for audio-visual speech processing: applied to text-dependent speaker recognition [J]. Lucey S., Chen T., Sridharan S. IEEE Transactions on Multimedia. 2005, No. 3
2. An audio-visual corpus for speech perception and automatic speech recognition (L) [J]. Cooke M, Barker J, Cunningham S. The Journal of the Acoustical Society of America. 2006, No. 5
3. Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation [J]. Ephrat Ariel, Mosseri Inbar, Lang Oran. ACM Transactions on Graphics. 2018, No. 4CD
4. The 'Audio-Visual Face Cover Corpus': Investigations into audio-visual speech and speaker recognition when the speaker's face is occluded by facewear [C]. Natalie Fecher. Annual conference of the International Speech Communication Association. 2012
5. Robust speech processing based on microphone array, audio-visual, and frame selection for in-vehicle speech recognition and in-set speaker recognition [D]. Zhang, Xianxian. 2005
6. Multimodal Speaker Diarization Using a Pre-Trained Audio-Visual Synchronization Model [O]. Rehan Ahmad, Syed Zubair, Hani Alquhayz. 2019
7. Speaker independent audio-visual continuous speech recognition [O]. A.V. Nefian
8. Conversational Telephone Speech Corpus Collection for the NIST Speaker Recognition Evaluation 2004 [R]. Martin, A., Miller, D., Przybocki, M. 2004
