Feature Normalization Using Structured Full Transforms for Robust Speech Recognition


Abstract

Classical mean and variance normalization (MVN) uses a diagonal transform and a bias vector to normalize the mean and variance of noisy features to reference values. Because MVN uses a diagonal transform, it ignores the correlation between feature dimensions. Although a full transform can exploit feature correlation, its large number of parameters may not be estimated reliably from a short observation, e.g., a single utterance. We propose a novel structured full transform that has the same number of free parameters as a diagonal transform while still capturing the correlation between feature dimensions. The proposed structured transform can be estimated reliably from one utterance by maximizing the likelihood of the normalized features under a reference Gaussian mixture model. Experimental results on the Aurora-4 task show that the structured transform yields consistently better speech recognition results than the diagonal transform and also outperforms the advanced front-end (AFE) feature extractor.
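
The classical MVN baseline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's structured transform, whose construction the abstract does not give. The function name `mvn`, the zero-mean and unit-variance reference values, and the synthetic "utterance" are all assumptions made for illustration. The sketch shows that a diagonal transform plus bias fixes each dimension's mean and variance but leaves the correlation between dimensions unchanged, and it compares free-parameter counts for diagonal and full transforms.

```python
import numpy as np

def mvn(features, ref_mean=0.0, ref_std=1.0, eps=1e-8):
    """Per-utterance mean and variance normalization (hypothetical helper).

    features: array of shape (num_frames, num_dims).
    Equivalent to y = A x + b with a diagonal A and a bias vector b.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    scale = ref_std / (std + eps)    # diagonal entries of the transform A
    bias = ref_mean - scale * mean   # bias vector b
    return features * scale + bias

# Synthetic stand-in for one noisy utterance (assumption, not real speech data):
# 300 frames, 3 feature dimensions that are correlated through a mixing matrix.
rng = np.random.default_rng(0)
mixing = np.array([[1.0, 0.8, 0.0],
                   [0.0, 1.0, 0.5],
                   [0.0, 0.0, 1.0]])
noisy = rng.normal(size=(300, 3)) @ mixing + 5.0

normalized = mvn(noisy)

# Correlation between dimensions before and after MVN.
corr_before = np.corrcoef(noisy, rowvar=False)
corr_after = np.corrcoef(normalized, rowvar=False)

# Free parameters for a D-dimensional feature vector.
dim = noisy.shape[1]
diag_params = 2 * dim          # D diagonal scales + D bias terms
full_params = dim * dim + dim  # D x D matrix + D bias terms
```

Because each dimension is only shifted and rescaled by a positive factor, `corr_after` matches `corr_before`: the off-diagonal correlation that a full transform could exploit is untouched. The gap between `diag_params` and `full_params` grows quadratically with the feature dimension, which is why a full transform is hard to estimate from a single utterance.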


Bibliographic details

Source
Annual Conference of the International Speech Communication Association (INTERSPEECH 2011) | 2011 | pp. 700-703 | 4 pages
Authors
Xiong Xiao; Jinyu Li; Eng Siong Chng; Haizhou Li
Original format: PDF
CLC classification: Communication
Keywords
robust speech recognition; feature normalization; maximum likelihood; eigen-decomposition


Similar literature


1. Temporal Structure Normalization of Speech Feature for Robust Speech Recognition [J]. Xiao X., Chng E. S., Li H. IEEE Signal Processing Letters. 2007, no. 7
2. Subband Feature Statistics Normalization Techniques Based on a Discrete Wavelet Transform for Robust Speech Recognition [J]. Jeih-weih Hung, Hao-Teng Fan. Signal Processing Letters, IEEE. 2009, no. 9
3. Temporal modulation normalization for robust speech feature extraction and recognition [J]. Xugang Lu, Shigeki Matsuda, Masashi Unoki. Multimedia Tools and Applications. 2011, no. 1
4. Irrelevant variability normalization based HMM training using MAP estimation of feature transforms for robust speech recognition [C]. Donglai Zhu, Qiang Huo. Personal, Indoor and Mobile Radio Communications, 2005 IEEE 16th International Symposium on. 2008
5. Duration normalization for robust recognition of spontaneous speech via missing feature methods [D]. Nedel, Jon P. 2004
6. New Features Using Robust MVDR Spectrum of Filtered Autocorrelation Sequence for Robust Speech Recognition [O]. Sanaz Seyedin, Seyed Mohammad Ahadi, Saeed Gazor. 2013
7. Cepstral Feature Normalization Methods Using Pole Filtering and Scale Normalization for Robust Speech Recognition [O]. Bo Kyeong Choi, Sung Min Ban, Hyung Soon Kim. 2015
8. Normalized Amplitude Modulation Features for Large Vocabulary Noise-Robust Speech Recognition [R]. Mitra, V., Franco, H., Graciarena, M. 2012
