A Non-Uniform Filterbank for Speaker Recognition

Abstract

Speaker-specific information is known to be distributed non-uniformly across the frequency domain. Current speaker recognition systems extract acoustic features on auditory-motivated scales. These scales, however, are not optimised to exploit the spectral distribution of speaker-specific information and hence may not be the best choice for speaker recognition. In this paper, the authors study the distribution of speaker-specific information for the Spectral Centroid Frequency feature and propose a non-uniform filterbank that captures this information effectively for that feature. The F-ratio and the Kullback-Leibler (KL) distance were used to measure the distribution of speaker-specific information, and it was shown empirically that the KL distance measures discriminative ability better than the F-ratio. The proposed filterbank emphasises regions of high KL distance by allocating more filters to them. Experimental results showed a relative reduction in equal error rate (EER) of 8.8% over the Mel-scale filterbank on the NIST 2006 SRE database.
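
As an illustration only (the abstract does not state the authors' allocation rule), one plausible way to place more filters where a discriminability measure is high is to space the filter centres uniformly along the cumulative integral of that measure. The sketch below assumes a strictly positive score curve sampled on a frequency grid. The function name allocate_centre_frequencies, the synthetic score curve, and the choice of 20 filters over 0 to 4000 Hz are all hypothetical and are not taken from the paper.

```python
import numpy as np


def allocate_centre_frequencies(freqs, score, n_filters):
    """Return n_filters centre frequencies whose density follows `score`.

    freqs : increasing frequency grid in Hz
    score : strictly positive discriminability value at each grid point
            (for example a KL-distance curve; here just an assumed input)
    """
    freqs = np.asarray(freqs, dtype=float)
    score = np.asarray(score, dtype=float)
    if freqs.shape != score.shape or freqs.ndim != 1:
        raise ValueError("freqs and score must be 1-D arrays of equal length")
    if np.any(score <= 0):
        raise ValueError("score must be strictly positive")

    # Cumulative integral of the score (trapezoid rule), normalised to [0, 1].
    steps = 0.5 * (score[1:] + score[:-1]) * np.diff(freqs)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]

    # Equal steps in cumulative score map to unequal steps in frequency:
    # small steps where the score is high, large steps where it is low.
    targets = np.linspace(0.0, 1.0, n_filters + 2)
    edges = np.interp(targets, cdf, freqs)
    return edges[1:-1]  # drop the two outer band edges


# Hypothetical score curve: flat baseline plus a bump around 3 kHz.
grid = np.linspace(0.0, 4000.0, 401)
demo_score = 1.0 + 4.0 * np.exp(-((grid - 3000.0) / 400.0) ** 2)
centres = allocate_centre_frequencies(grid, demo_score, 20)
print(np.round(centres))
```

With a constant score the centres fall back to a uniform spacing, so the warping comes entirely from the shape of the score curve. Triangular or other filter shapes could then be built between neighbouring centres, which is a separate design choice not covered here.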

Bibliographic details

Source
Annual Conference of the International Speech Communication Association | 2012 | pp. 2271-2274 | 4 pages
Authors
Jia Min Karen Kua; Tharmarajah Thiruvaran; Eliathamby Ambikairajah
Original format
PDF
Keywords
speaker recognition; F-ratio; Kullback-Leibler distance; Spectral centroid frequency;

Similar documents

1. Discriminative Learning of Filterbank Layer within Deep Neural Network Based Speech Recognition for Speaker Adaptation [J]. Hiroshi SEKI, Kazumasa YAMAMOTO, Tomoyosi AKIBA. IEICE Transactions on Information and Systems, 2019, No. 2.
2. Role of Linear, Mel and Inverse-Mel Filterbanks in Automatic Recognition of Speech from High-Pitched Speakers [J]. Kathania Hemant Kumar, Shahnawazuddin S., Ahmad Waquar. Circuits, Systems, and Signal Processing, 2019, No. 10.
3. The Wavelet and Fourier Transforms in Feature Extraction for Text-Dependent, Filterbank-Based Speaker Recognition [J]. Claude Turner, Anthony Joseph, Murat Aksu. Procedia Computer Science, 2011, No. 1.
4. A Non-Uniform Filterbank for Speaker Recognition [C]. Jia Min Karen Kua, Tharmarajah Thiruvaran, Eliathamby Ambikairajah. INTERSPEECH 2012, 2012.
5. Speech enhancement using a truncated and constrained minimum variance estimator in non-uniform wavelet filterbanks [D]. Koh, Min-Sung. 2002.
6. Revisiting vocal perception in non-human animals: a review of vowel discrimination speaker voice recognition and speaker normalization [O]. Buddhamas Kriengwatana, Paola Escudero, Carel ten Cate. 2014.
7. Discriminative Learning of Filterbank Layer within Deep Neural Network Based Speech Recognition for Speaker Adaptation [O]. Hiroshi SEKI, Kazumasa YAMAMOTO, Tomoyosi AKIBA. 2019.
8. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment [R]. Hansen, J. H. 2015.
