Conference on Applications and Science of Computational Intelligence IV, Apr 17-18, 2001, Orlando, USA

Multimodal fusion of polynomial classifiers for automatic person recognition



Abstract

With the prevalence of the information age, privacy and personalization are at the forefront of today's society. As such, biometrics are viewed as essential components of current and evolving technological systems. Consumers demand unobtrusive and non-invasive approaches. In our previous work, we have demonstrated a speaker verification system that meets these criteria. However, there are additional constraints for fielded systems. The required recognition transactions are often performed in adverse environments and across diverse populations, necessitating robust solutions. There are two significant problem areas in current-generation speaker verification systems. The first is the difficulty in acquiring clean audio signals (in all environments) without encumbering the user with a head-mounted close-talking microphone. Second, unimodal biometric systems do not work for a significant percentage of the population. To combat these issues, multimodal techniques are being investigated to improve system robustness to environmental conditions, as well as to improve overall accuracy across the population. We propose a multimodal approach that builds on our current state-of-the-art speaker verification technology. In order to maintain the transparent nature of the speech interface, we focus on optical sensing technology to provide the additional modality, giving us an audio-visual person recognition system. For the audio domain, we use our existing speaker verification system. For the visual domain, we focus on lip motion. This is chosen, rather than static face or iris recognition, because it provides dynamic information about the individual. In addition, the lip dynamics can aid speech recognition to provide liveness testing. The visual processing method makes use of both color and edge information, combined within a Markov random field (MRF) framework, to localize the lips. Geometric features are extracted and input to a polynomial classifier for the person recognition process.
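The abstract does not spell out the training procedure for the polynomial classifier, but the standard approach in this line of work expands each feature vector into monomial terms and fits one weight vector per speaker by least squares. The function and parameter names below are illustrative, not taken from the paper; this is a minimal sketch of the general technique, assuming degree-2 expansion and ideal 0/1 targets:

```python
import numpy as np
from itertools import combinations_with_replacement

def poly_expand(x, degree=2):
    """Expand a feature vector into all monomials up to `degree`:
    1, x_i, x_i*x_j, ... (the polynomial basis)."""
    terms = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(x)), d):
            terms.append(float(np.prod(x[list(idx)])))
    return np.array(terms)

def train_polynomial_classifier(features, labels, degree=2):
    """Fit one weight vector per class by least squares:
    w_c = argmin ||P w - t_c||, with t_c = 1 for class c, else 0."""
    P = np.array([poly_expand(f, degree) for f in features])
    models = {}
    for c in set(labels):
        t = np.array([1.0 if l == c else 0.0 for l in labels])
        w, *_ = np.linalg.lstsq(P, t, rcond=None)
        models[c] = w
    return models

def score(models, feature, degree=2):
    """Score a test feature against every enrolled class model."""
    p = poly_expand(feature, degree)
    return {c: float(w @ p) for c, w in models.items()}
```

In a verification setting the claimed speaker's score would be compared against a threshold rather than against the other classes.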
A late integration approach, based on a probabilistic model, is employed to combine the two modalities. The system is tested on the XM2VTS database combined with AWGN (in the audio domain) over a range of signal-to-noise ratios.
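The paper's probabilistic fusion model is not detailed in the abstract; a common late-integration baseline combines per-modality scores with a weighted sum, and the AWGN test condition corrupts the audio at a chosen signal-to-noise ratio. Both helpers below are hedged sketches under those assumptions, with names invented for illustration:

```python
import numpy as np

def add_awgn(signal, snr_db, seed=0):
    """Corrupt a signal with additive white Gaussian noise at a
    target SNR in dB (the audio-degradation setup in the evaluation)."""
    sig_power = np.mean(signal ** 2)
    noise_power = sig_power / (10 ** (snr_db / 10))
    rng = np.random.default_rng(seed)
    return signal + rng.normal(0.0, np.sqrt(noise_power), signal.shape)

def fuse_scores(audio_score, visual_score, audio_weight=0.5):
    """Late integration: weighted combination of per-modality scores.
    In practice the weight might be tuned to the estimated audio SNR,
    shifting reliance toward the visual channel as noise increases."""
    return audio_weight * audio_score + (1.0 - audio_weight) * visual_score
```

Because fusion happens at the score level, each modality's front end and classifier can be developed and tuned independently.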
