
A Robust Speaker Identification System Using the Responses from a Model of the Auditory Periphery



Abstract

Speaker identification under noisy conditions is a challenging problem in speech processing. Motivated by the fact that neural responses are robust against noise, this paper proposes a new speaker identification system using 2-D neurograms constructed from the responses of a physiologically based computational model of the auditory periphery. The responses of auditory-nerve fibers to speech signals were simulated over a wide range of characteristic frequencies to construct the neurograms. The neurogram coefficients were used to train a Gaussian mixture model-universal background model (GMM-UBM) classifier, which generates an identity model for each speaker. Three text-independent speaker databases and one text-dependent speaker database were employed to test the identification performance of the proposed method. The robustness of the method was also investigated using speech signals distorted by three types of noise (white Gaussian, pink, and street noise) at different signal-to-noise ratios. The identification results of the proposed neural-response-based method were compared with those of traditional speaker identification methods using features such as Mel-frequency cepstral coefficients, Gammatone frequency cepstral coefficients, and frequency domain linear prediction. Although the classification accuracy of the proposed method was comparable to that of the traditional techniques in quiet, the new feature was found to yield lower classification error rates in noisy environments.
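The abstract mentions speech distorted by noise at different signal-to-noise ratios but does not say how the noise was mixed in. The sketch below shows one common convention, not necessarily the authors' procedure: rescale the noise so that the ratio of average signal power to average noise power matches a target SNR in dB, then add it to the signal. The function names, the 440 Hz tone standing in for speech, and the 16 kHz sampling rate are all assumptions made for illustration.

```python
import math
import random


def mean_power(samples):
    """Average power of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def add_noise_at_snr(signal, noise, snr_db):
    """Return signal + noise, with noise rescaled to the target SNR in dB.

    Hypothetical helper: uses SNR_dB = 10 * log10(P_signal / P_noise),
    with power averaged over the whole waveform.
    """
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    p_signal = mean_power(signal)
    p_noise = mean_power(noise)
    if p_signal == 0 or p_noise == 0:
        raise ValueError("signal and noise must have nonzero power")
    # Noise power needed to reach the requested SNR.
    target_noise_power = p_signal / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)
    return [s + gain * n for s, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB of `noisy` relative to the known clean signal."""
    residual = [y - s for s, y in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(residual))


# Stand-in for one second of speech: a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
signal = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]

# White Gaussian noise from a seeded generator for reproducibility.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

noisy = add_noise_at_snr(signal, noise, snr_db=5.0)
```

Pink or street noise would be mixed the same way by passing a different `noise` sequence. Only the scaling step is shown here; generating or loading those noise types is outside this sketch.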
