Spectral-temporal receptive fields and MFCC balanced feature extraction for noisy speech recognition

Abstract

This paper proposes a new set of acoustic features based on spectral-temporal receptive fields (STRFs). The STRF is an analysis method for studying a physiological model of the mammalian auditory system in the spectral-temporal domain. It has two parts: the rate (in Hz), which represents the temporal response, and the scale (in cycles/octave), which represents the spectral response. From the obtained STRF, we derive an acoustic feature in three steps. First, the energy of each scale is calculated from the STRF. Next, the logarithm of the scale energies is taken. Finally, the discrete cosine transform is applied to generate the proposed STRF feature. In our experiments, we combine the proposed STRF feature with conventional Mel frequency cepstral coefficients (MFCCs) to verify its effectiveness. In a noise-free environment, the proposed feature can increase the recognition rate by 17.48%. In noisy environments, the increase in the recognition rate ranges from 5% to 12%.
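The three steps in the abstract (scale energy, logarithm, discrete cosine transform) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not specify the STRF array layout, how the per-scale energy is accumulated, or how the STRF feature is combined with the MFCCs, so the `(num_scales, num_rates, num_channels)` shape, the sum of squared magnitudes, the small `eps` floor, and the plain concatenation in `combine_with_mfcc` are all assumptions. The function names are hypothetical.

```python
import numpy as np


def dct_ii(x):
    """Orthonormal DCT-II of a 1-D array, written out explicitly."""
    n = x.shape[0]
    k = np.arange(n)[:, None]  # output (cepstral) index
    m = np.arange(n)[None, :]  # input (scale) index
    basis = np.cos(np.pi * (m + 0.5) * k / n)
    norm = np.full(n, np.sqrt(2.0 / n))
    norm[0] = np.sqrt(1.0 / n)
    return norm * (basis @ x)


def strf_feature(strf_frame, num_coeffs=None, eps=1e-10):
    """Scale energy -> log -> DCT for one analysis frame.

    strf_frame: assumed shape (num_scales, num_rates, num_channels),
    real or complex STRF response for a single frame.
    """
    # Step 1: energy of each scale (assumed: sum of squared magnitudes
    # over the rate and frequency-channel axes).
    scale_energy = np.sum(np.abs(strf_frame) ** 2, axis=(1, 2))
    # Step 2: logarithm of the scale energies (eps avoids log(0)).
    log_energy = np.log(scale_energy + eps)
    # Step 3: DCT across the scale axis.
    coeffs = dct_ii(log_energy)
    return coeffs if num_coeffs is None else coeffs[:num_coeffs]


def combine_with_mfcc(strf_feat, mfcc_feat):
    """Assumed combination: concatenate MFCCs and STRF coefficients."""
    return np.concatenate([mfcc_feat, strf_feat])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal((6, 10, 128))  # hypothetical STRF output
    mfcc = rng.standard_normal(13)             # hypothetical MFCC vector
    print(combine_with_mfcc(strf_feature(frame), mfcc).shape)
```

The log-then-DCT stage mirrors the final stage of conventional MFCC extraction, applied here across scales rather than across Mel filterbank channels.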

Bibliographic details

Source
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2014 | pp. 1-4 | 4 pages
Authors
Jia-Ching Wang; Chang-Hong Lin; En-Ting Chen; Pao-Chi Chang
Original format
PDF
Keywords
discrete cosine transforms; feature extraction; speech recognition; MFCC balanced feature extraction; STRF feature; acoustic features; conventional MFCC; conventional Mel frequency cepstral coefficients; discrete cosine transform; logarithmic operation; mammalian auditory system; noise-free environment; noisy speech recognition; physiological model; recognition rate; scale energies; spectral response; spectral-temporal domain; spectral-temporal receptive fields; temporal response; Decision support systems; Mel frequency cepstral coefficient; Physiology; Speech; Speech processing; Mel frequency cepstral coefficients

Similar literature

1. Spectral-temporal receptive fields and MFCC balanced feature extraction for robust speaker recognition [J]. Wang Jia-Ching, Wang Chien-Yao, Chin Yu-Hao. Multimedia Tools and Applications, 2017, No. 3
2. Robust Speech Recognition System Using Conventional and Hybrid Features of MFCC, LPCC, PLP, RASTA-PLP and Hidden Markov Model Classifier in Noisy Conditions [J]. Veton Z. Këpuska, Hussien A. Elharati. Journal of Computer and Communications, 2015, No. 6
3. Identification of Noisy Speech Signals using Bispectrum-based 2D-MFCC and Its Optimization through Genetic Algorithm as a Feature Extraction Subsystem [J]. Benyamin Kusumoputro, Agus Buono, Lina. WSEAS Transactions on Computers, 2012, No. 7a9
4. Spectral-temporal receptive fields and MFCC balanced feature extraction for noisy speech recognition [C]. Jia-Ching Wang, Chang-Hong Lin, En-Ting Chen. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
5. A speech recognition IC with an efficient MFCC extraction algorithm and multi-mixture models [D]. Han, Wei. 2006
6. Spectral-Temporal Receptive Fields of Nonlinear Auditory Neurons Obtained Using Natural Sounds [O]. Frédéric E. Theunissen, Kamal Sen, Allison J. Doupe. 2000
7. MSP-MFCC: Energy-Efficient MFCC Feature Extraction Method With Mixed-Signal Processing Architecture for Wearable Speech Recognition Applications [O]. Qin Li, Yuze Yang, Tianxiang Lan. 2020
