Sparse Auditory Reproducing Kernel (SPARK) Features for Noise-Robust Speech and Speaker Recognition

Abstract

The speech feature extraction algorithm is based on a hierarchical combination of auditory similarity and pooling functions. Computationally efficient features, referred to as “Sparse Auditory Reproducing Kernel” (SPARK) coefficients, are extracted under the hypothesis that the noise-robust information in the speech signal is embedded in a reproducing kernel Hilbert space (RKHS) spanned by overcomplete, nonlinear, and time-shifted gammatone basis functions. The feature extraction algorithm first computes a kernel-based similarity between the speech signal and the time-shifted gammatone functions, then prunes the resulting features using a simple pooling technique (the “MAX” operation). Different hyper-parameters and kernel functions may be used to enhance the performance of a SPARK-based speech recognizer.
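The two stages described in the abstract (kernel similarity against time-shifted gammatone functions, then MAX pooling) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the function names `gammatone`, `rbf_kernel`, and `spark_features`, the 4th-order gammatone with a Glasberg-Moore ERB bandwidth, the Gaussian kernel, the zero-padded shift, and the shift grid.

```python
import numpy as np

def gammatone(fc, fs, n_samples, order=4):
    """Unit-norm gammatone impulse response centered at fc Hz (assumed form)."""
    t = np.arange(n_samples) / fs
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)   # Glasberg-Moore ERB in Hz
    b = 1.019 * erb                            # common bandwidth scaling (assumption)
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.linalg.norm(g)

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel; an illustrative choice, other kernels could be swapped in."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def shift_right(g, s):
    """Delay g by s samples, zero-padding at the start (no wrap-around)."""
    out = np.zeros_like(g)
    out[s:] = g[:len(g) - s]
    return out

def spark_features(frame, fs, centers, shifts, kernel=rbf_kernel):
    """One feature per center frequency: MAX over shifts of the kernel similarity."""
    x = frame / (np.linalg.norm(frame) + 1e-12)   # normalize the frame energy
    feats = np.empty(len(centers))
    for i, fc in enumerate(centers):
        g = gammatone(fc, fs, len(frame))
        sims = [kernel(x, shift_right(g, s)) for s in shifts]
        feats[i] = max(sims)                       # the "MAX" pooling step
    return feats
```

A call such as `spark_features(frame, 16000, [500.0, 1000.0, 2000.0], range(0, 64, 4))` returns one pooled similarity per center frequency for a single frame. A full front end would apply this frame by frame and feed the result to a recognizer, which this sketch leaves out.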
