Journal: International Journal on Computer Science and Engineering
Audio-Visual Based Multi-Sample Fusion to Enhance Correlation Filters Speaker Verification System

Abstract

In this study, we propose a novel approach to speaker verification that uses spectrogram images as features and Unconstrained Minimum Average Correlation Energy (UMACE) filters as classifiers. Because speech is a behavioral signal, speech data tend not to be reproduced consistently, owing to changes in speaking rate, health, emotional condition, temperature and humidity. To overcome this problem, we propose a modification of the UMACE filter architecture that performs multi-sample fusion using speech and lipreading data. To identify the best fusion scheme, five multi-sample fusion strategies (maximum, minimum, median, average and majority vote) are first evaluated on speech signal data. The performance of the audio-visual system using the enhanced UMACE filters is then tested: lipreading data are added to the pool of audio samples, and the best fusion scheme found in the earlier experiment is used for multi-sample fusion. The Digit Database was used for performance evaluation. The enhanced UMACE filters achieve a performance of up to 99.64% for the speech-only system, a 6.89% improvement over the baseline approach. The audio-visual system is further observed to widen the interval between the peak-to-sidelobe ratio (PSR) scores of authentic and imposter data and to improve on the performance of the audio-only system, leading toward a robust verification system.
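The five multi-sample fusion strategies named in the abstract can be illustrated with a minimal sketch. It assumes each sample has already been reduced to a single PSR score and that a claim is accepted when the fused score meets a threshold. The function name `fuse_scores`, the example scores and the threshold value are hypothetical and are not taken from the paper, which may apply the strategies differently.

```python
from statistics import mean, median


def fuse_scores(psr_scores, strategy, threshold):
    """Fuse per-sample PSR scores into one accept/reject decision.

    Hypothetical helper for illustration only; the decision rule
    (fused score >= threshold) is an assumption, not the paper's.
    """
    if not psr_scores:
        raise ValueError("need at least one PSR score")

    if strategy == "majority":
        # Each sample votes on its own; accept on a strict majority.
        votes = sum(score >= threshold for score in psr_scores)
        return votes > len(psr_scores) / 2

    # The other four strategies reduce the scores to one fused value.
    reducers = {
        "maximum": max,
        "minimum": min,
        "median": median,
        "average": mean,
    }
    if strategy not in reducers:
        raise ValueError(f"unknown strategy: {strategy}")
    return reducers[strategy](psr_scores) >= threshold


# Made-up PSR scores for five samples of one verification attempt.
scores = [12.0, 8.5, 15.2, 9.1, 11.3]
decisions = {
    name: fuse_scores(scores, name, threshold=10.0)
    for name in ("maximum", "minimum", "median", "average", "majority")
}
```

With these made-up scores, only the minimum rule rejects the claim, because a single low-scoring sample is enough to pull the fused value under the threshold. This shows how the choice of strategy trades tolerance of inconsistent samples against strictness.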
