A feature extraction method for speech recognition based on temporal tracking of clusters in spectro-temporal domain

Abstract

In this paper, a novel approach is proposed for secondary feature extraction based on cluster tracking in the spectro-temporal domain. Because of the high dimensionality of the spectro-temporal feature space, this domain is unsuitable for practical speech recognition systems. To reduce the dimensionality of the feature space, the weighted K-means (WKM) clustering technique is applied to the spectro-temporal domain. The elements of the mean vectors and covariance matrices of the clusters are taken as the feature vector of each frame. However, the cluster locations change gradually over time. The main approach is based on the idea that the variations in cluster locations should be tracked temporally, frame by frame, and that the parameters of these variations should be included in the secondary feature vector of each speech frame. Several models are used to register the clusters in each incoming frame. In addition, a new architecture is proposed that classifies speech frames with a combined classifier using both tracked and non-tracked secondary features. The proposed feature vectors were evaluated on the classification of several phoneme subsets of the TIMIT database. Using tracked secondary feature vectors, the classification rate on voiced plosives improved to 77.4%, a relative improvement of 1.8% over non-tracked secondary feature vectors. The results on the other subsets also showed improved classification rates.
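The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the registration models, so the warm-start tracking below (initializing each frame's clustering from the previous frame's centroids and using the centroid shifts as extra features) is an assumption, as are all function names, the 2-D toy points, and the weights.

```python
def nearest(p, centroids):
    """Index of the centroid closest to point p (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))


def weighted_kmeans(points, weights, init_centroids, iters=20):
    """Lloyd-style weighted K-means; returns (centroids, labels)."""
    centroids = [list(c) for c in init_centroids]
    dim = len(points[0])
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for j in range(len(centroids)):
            members = [i for i, lab in enumerate(labels) if lab == j]
            total = sum(weights[i] for i in members)
            if total > 0:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(weights[i] * points[i][d] for i in members) / total
                                for d in range(dim)]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


def secondary_features(points, weights, centroids, labels):
    """Concatenate each cluster's centroid and the upper triangle of its
    weighted covariance matrix (computed about that centroid)."""
    dim = len(points[0])
    feats = []
    for j, mu in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == j]
        total = sum(weights[i] for i in members)
        feats.extend(mu)
        for a in range(dim):
            for b in range(a, dim):
                s = sum(weights[i] * (points[i][a] - mu[a]) * (points[i][b] - mu[b])
                        for i in members)
                feats.append(s / total if total > 0 else 0.0)
    return feats


def track(prev_centroids, points, weights):
    """Assumed registration model: warm-start the new frame's clustering from
    the previous centroids, so cluster j keeps its identity across frames."""
    centroids, labels = weighted_kmeans(points, weights, prev_centroids)
    shifts = [c - p for cen, prev in zip(centroids, prev_centroids)
              for c, p in zip(cen, prev)]
    return centroids, labels, shifts


# Toy data (made-up values): 2-D points with energy-like weights.
frame1 = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
frame2 = [(x + 1.0, y) for x, y in frame1]  # same pattern shifted along axis 0
w = [1.0, 3.0, 1.0, 1.0]

c1, l1 = weighted_kmeans(frame1, w, [(0.0, 0.0), (10.0, 10.0)])
non_tracked = secondary_features(frame1, w, c1, l1)
c2, l2, shifts = track(c1, frame2, w)
tracked = secondary_features(frame2, w, c2, l2) + shifts
```

Here `non_tracked` holds only the per-frame cluster statistics, while `tracked` appends the frame-to-frame centroid shifts. A combined classifier of the kind the abstract mentions would receive both vectors.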

Bibliographic record

Source
International Symposium on Artificial Intelligence and Signal Processing, 2012, 6 pages
Authors
Esfandian Nafiseh; Razzazi Farbod; Behrad Alireza
Original format: PDF
Chinese Library Classification: TP18-53

Similar literature

1. A clustering based feature selection method in spectro-temporal domain for speech recognition [J]. Nafiseh Esfandian, Farbod Razzazi, Alireza Behrad. Engineering Applications of Artificial Intelligence, 2012, No. 6.
2. Nonlinear spectro-temporal features based on a cochlear model for automatic speech recognition in a noisy situation [J]. Choi Y.-S., Lee S.-Y. Neural Networks: The Official Journal of the International Neural Network Society, 2013.
3. A feature extraction method for speech recognition based on temporal tracking of clusters in spectro-temporal domain [C]. Esfandian Nafiseh, Razzazi Farbod, Behrad Alireza. The 16th CSI International Symposium on Artificial Intelligence & Signal Processing, 2012.
4. Array-based Spectro-temporal Masking for Automatic Speech Recognition [D]. Moghimi, Amir R. 2014.
5. On the Speech Properties and Feature Extraction Methods in Speech Emotion Recognition [O]. Juraj Kacur, Boris Puterka, Jarmila Pavlovicova. 2021.
6. Spectro-temporal Features for Automatic Speech Recognition using Linear Prediction in Spectral Domain [O]. Ganapathy Sriram, Hermansky H., Thomas Samuel. 2008.
