VALID: A New Practical Audio-Visual Database, and Comparative Results


Abstract

The performance of deployed audio, face, and multi-modal person recognition systems in uncontrolled scenarios is typically lower than that of systems developed in highly controlled environments. To facilitate the development of robust audio, face, and multi-modal person recognition systems, the new large and realistic multi-modal (audio-visual) VALID database was acquired in a noisy "real world" office scenario with no control over illumination or acoustic noise. In this paper we describe the acquisition and content of the VALID database, which consists of five recording sessions of 106 subjects over a period of one month. Speaker identification experiments using visual speech features extracted from the mouth region are reported. The performance on the uncontrolled VALID database is compared with that on the controlled XM2VTS database. The best VALID and XM2VTS based accuracies are 63.21% and 97.17%, respectively. This highlights the degrading effect of an uncontrolled illumination environment and the importance of this database for deploying real-world applications. The VALID database is available to the academic community through http://ee.ucd.ie/validdb/.

Bibliographic details

Source
International Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA 2005); July 20-22, 2005; Hilton Rye Town, NY (US) | 2005 | pp. 777-786 | 10 pages
Conference location: Hilton Rye Town, NY (US)
Authors
Niall A. Fox; Brian A. O'Mullane; Richard B. Reilly
Author affiliation

Dept. of Electronic and Electrical Engineering, University College Dublin, Belfield, Dublin 4, Ireland;

Original format: PDF
Language: English
Chinese Library Classification: Information processing

