A Non-Uniform Filterbank for Speaker Recognition

Abstract

Speaker-specific information is known to be distributed non-uniformly across the frequency domain. Current speaker recognition systems extract acoustic features on auditory-motivated scales. These scales, however, are not optimised to exploit the spectral distribution of speaker-specific information, and hence may not be the best choice for speaker recognition. This paper studies the distribution of speaker-specific information for the Spectral Centroid Frequency feature and proposes a non-uniform filterbank that captures this information more effectively for that feature. The F-ratio and the Kullback-Leibler (KL) distance were used to measure the distribution of speaker-specific information, and it was shown empirically that the KL distance is a better measure of discriminative ability than the F-ratio. The proposed filterbank emphasises regions of high KL distance by allocating more filters to them. Experimental results showed a relative EER reduction of 8.8% over the Mel-scale filterbank on the NIST 2006 SRE database.
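As a rough illustration of the two measures named in the abstract, the sketch below computes an F-ratio and a mean pairwise KL distance for a single frequency band. It is not the authors' procedure: it assumes each speaker's feature values in the band are modelled by a univariate Gaussian, uses the symmetric form of the KL divergence, and the function names (`f_ratio`, `symmetric_kl_gaussian`, `mean_pairwise_kl`) are hypothetical.

```python
import numpy as np

def f_ratio(feats_by_speaker):
    """Variance of the speaker means divided by the mean within-speaker variance."""
    means = np.array([np.mean(x) for x in feats_by_speaker])
    variances = np.array([np.var(x) for x in feats_by_speaker])
    return np.var(means) / np.mean(variances)

def symmetric_kl_gaussian(m1, v1, m2, v2):
    """KL(p||q) + KL(q||p) for two univariate Gaussians (assumed model)."""
    d2 = (m1 - m2) ** 2
    return 0.5 * ((v1 + d2) / v2 + (v2 + d2) / v1 - 2.0)

def mean_pairwise_kl(feats_by_speaker):
    """Average symmetric KL distance over all pairs of speakers."""
    stats = [(np.mean(x), np.var(x)) for x in feats_by_speaker]
    total, count = 0.0, 0
    for i in range(len(stats)):
        for j in range(i + 1, len(stats)):
            total += symmetric_kl_gaussian(*stats[i], *stats[j])
            count += 1
    return total / count

# Toy data: one band, two speakers, two feature values each (illustrative only).
band_feats = [np.array([0.0, 2.0]), np.array([4.0, 6.0])]
print(f_ratio(band_feats), mean_pairwise_kl(band_feats))
```

Repeating this per band gives a score curve over frequency. One visible difference between the two measures in this sketch is that the F-ratio pools all within-speaker variances into a single average, whereas the KL distance compares speakers pair by pair using each speaker's own variance.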
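The abstract says the filterbank allocates more filters to regions of high KL distance but does not give the allocation rule. One plausible way to do this, shown here purely as an assumption, is to treat the per-frequency score as a density, integrate it into a cumulative curve, and place filter centre frequencies at equally spaced quantiles of that curve. The function name `allocate_centres` and the toy scores are hypothetical, and the score is assumed to be strictly positive.

```python
import numpy as np

def allocate_centres(freqs, score, n_filters):
    """Place filter centres so that denser regions of `score` get more filters.

    freqs: increasing frequency grid in Hz.
    score: positive discriminability score at each grid point (assumed > 0).
    """
    freqs = np.asarray(freqs, dtype=float)
    score = np.asarray(score, dtype=float)
    # Trapezoidal area of each grid segment, accumulated into a cumulative curve.
    seg = 0.5 * (score[1:] + score[:-1]) * np.diff(freqs)
    cdf = np.concatenate(([0.0], np.cumsum(seg)))
    cdf /= cdf[-1]
    # Equally spaced quantile targets, mapped back to frequency.
    targets = (np.arange(n_filters) + 0.5) / n_filters
    return np.interp(targets, cdf, freqs)

# Toy example: the score is three times higher above 3 kHz than below 2 kHz.
freqs = [0.0, 1000.0, 2000.0, 3000.0, 4000.0]
score = [1.0, 1.0, 1.0, 3.0, 3.0]
print(allocate_centres(freqs, score, 4))
```

With a flat score this reduces to uniformly spaced centres. With the toy score above, three of the four centres fall above 2 kHz, which is the qualitative behaviour the abstract describes.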
