A clustering based feature selection method in spectro-temporal domain for speech recognition

Nafiseh Esfandian; Farbod Razzazi; Alireza Behrad


Abstract

Spectro-temporal representation of speech has become one of the leading signal representation approaches in speech recognition systems in recent years. This representation suffers from the high dimensionality of its feature space, which makes the domain unsuitable for practical speech recognition systems. In this paper, a new clustering-based method is proposed for secondary feature selection/extraction in the spectro-temporal domain. In the proposed representation, Gaussian mixture model (GMM) and weighted K-means (WKM) clustering techniques are applied in the spectro-temporal domain to reduce the dimensionality of the feature space. The elements of the centroid vectors and covariance matrices of the clusters are taken as the attributes of the secondary feature vector of each frame. To evaluate the efficiency of the proposed approach, the new feature vectors were tested on phoneme classification within the main phoneme categories of the TIMIT database. The proposed secondary feature vector yielded a significant improvement in classification rate over MFCC features for different sets of phonemes. For voiced plosives, the average improvement in classification rate over MFCC features is 5.9% using WKM clustering and 6.4% using GMM clustering. The greatest improvement, about 7.4%, is obtained by using WKM clustering in the classification of front vowels.
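The idea of turning cluster centroids and covariances into a per-frame secondary feature vector can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify the details, so the following are all assumptions: each frame is given as a list of 3-D points with non-negative weights (for example, coordinates in the spectro-temporal space weighted by response magnitude), the data are synthetic, clusters are ordered by centroid to keep the layout stable, and only the upper triangle of each covariance matrix is kept. The function names `weighted_kmeans`, `weighted_covariance`, and `secondary_feature_vector` are hypothetical.

```python
import random


def weighted_kmeans(points, weights, k, iters=50, seed=0):
    """Weighted K-means: heavier points pull their centroid harder."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: weighted mean of each cluster's members.
        for c in range(k):
            members = [(p, w) for p, w, l in zip(points, weights, labels) if l == c]
            total = sum(w for _, w in members)
            if total > 0:
                centroids[c] = [sum(w * p[d] for p, w in members) / total
                                for d in range(dim)]
    return centroids, labels


def weighted_covariance(members, mean):
    """Weighted covariance matrix of (point, weight) pairs around `mean`."""
    dim = len(mean)
    total = sum(w for _, w in members)
    cov = [[0.0] * dim for _ in range(dim)]
    if total <= 0:
        return cov  # empty cluster: fall back to zeros
    for p, w in members:
        diff = [x - m for x, m in zip(p, mean)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += w * diff[i] * diff[j] / total
    return cov


def secondary_feature_vector(points, weights, k):
    """Concatenate centroid and covariance elements of every cluster."""
    centroids, labels = weighted_kmeans(points, weights, k)
    dim = len(points[0])
    feature = []
    # Assumption: sort clusters by centroid so the layout is stable across frames.
    for c in sorted(range(k), key=lambda c: centroids[c]):
        members = [(p, w) for p, w, l in zip(points, weights, labels) if l == c]
        cov = weighted_covariance(members, centroids[c])
        feature.extend(centroids[c])
        # The covariance matrix is symmetric, so keep only the upper triangle.
        feature.extend(cov[i][j] for i in range(dim) for j in range(i, dim))
    return feature


# Synthetic stand-in for one frame: two blobs of weighted 3-D points.
rng = random.Random(1)
points = [tuple(rng.gauss(mu, 0.3) for mu in center)
          for center in [(1.0, 2.0, 3.0), (6.0, 5.0, 4.0)]
          for _ in range(20)]
weights = [rng.uniform(0.1, 1.0) for _ in points]
feature = secondary_feature_vector(points, weights, k=2)
print(len(feature))  # 2 clusters * (3 centroid + 6 covariance entries) = 18
```

With `k` clusters in `d` dimensions this layout gives `k * (d + d * (d + 1) / 2)` values per frame, regardless of how many points the frame contains, which is the sense in which clustering reduces the dimensionality here. A GMM variant would replace the hard assignment with soft responsibilities and is not shown.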


Bibliographic details

Source
Engineering Applications of Artificial Intelligence, 2012, No. 6, pp. 1194-1202 (9 pages)
Authors
Nafiseh Esfandian; Farbod Razzazi; Alireza Behrad
Author affiliations

Department of Electrical and Computer Engineering, Islamic Azad University, Science and Research Branch, Tehran, Iran;

Department of Electrical and Computer Engineering, Islamic Azad University, Science and Research Branch, Tehran, Iran;

Faculty of Engineering, Shahed University, Tehran, Iran;

Original format: PDF
Language: English
Keywords
speech recognition; spectro-temporal model; feature extraction; clustering; Gaussian mixture models; weighted K-means


