Source: Defence Science Journal

Temporal Pattern Classification using Kernel Methods for Speech Recognition and Speech Emotion Recognition

Abstract

There are two paradigms for modelling varying-length temporal data: modelling sequences of feature vectors, as in hidden Markov model-based approaches to speech recognition, and modelling sets of feature vectors, as in Gaussian mixture model (GMM)-based approaches to speech emotion recognition. This paper first presents methods that classify a discretised representation of a sequence of feature vectors, obtained by clustering and vector quantisation in the kernel feature space, using discrete hidden Markov models (DHMMs) in the kernel feature space and a string kernel-based SVM classifier. The authors then present continuous density hidden Markov models (CDHMMs) in the explicit kernel feature space, which use the continuous-valued representation of features extracted from the temporal data. Methods that map a varying-length sequential pattern to a fixed-length sequential pattern and then classify it with an SVM-based classifier are also presented. For the task of recognising spoken letters in the E-set, it is demonstrated that models using a discretised representation and string kernel SVM-based classification can achieve better classification performance than models using the continuous-valued representation. For modelling temporal data represented as sets of vectors, two approaches in a hybrid framework are presented: the score vector-based approach and the segment modelling-based approach. In both, a generative model-based method is used to obtain a fixed-length pattern representation of varying-length temporal data, and a discriminative model is then used for classification. The two approaches are studied on the speech emotion recognition task, where the segment modelling-based approach performs better than both the score vector-based approach and GMM-based classifiers.
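
To make the discretised-sequence idea concrete, the following is a minimal sketch, not the authors' method. It assumes a k-spectrum kernel as the string kernel (the abstract does not say which string kernel is used), and it substitutes plain Euclidean nearest-codeword quantisation in the input space for the paper's clustering and vector quantisation in the kernel feature space. The function names, the codebook, and the choice `k=2` are all illustrative assumptions.

```python
from collections import Counter
from math import sqrt


def quantise(frames, codebook):
    """Map each feature vector to the index of its nearest codeword.

    Assumption: squared Euclidean distance in the input space, used here
    only as a stand-in for quantisation in a kernel feature space.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda j: sqdist(f, codebook[j]))
        for f in frames
    ]


def spectrum_counts(symbols, k):
    """Count every contiguous k-gram in a symbol sequence."""
    return Counter(
        tuple(symbols[i:i + k]) for i in range(len(symbols) - k + 1)
    )


def spectrum_kernel(s, t, k=2):
    """k-spectrum kernel: inner product of the two k-gram count vectors."""
    cs, ct = spectrum_counts(s, k), spectrum_counts(t, k)
    return sum(n * ct[g] for g, n in cs.items())


def normalised_kernel(s, t, k=2):
    """Cosine-normalised kernel, so that sequence length does not dominate."""
    denom = sqrt(spectrum_kernel(s, s, k) * spectrum_kernel(t, t, k))
    return spectrum_kernel(s, t, k) / denom if denom else 0.0
```

Because the kernel compares k-gram counts, two utterances of different lengths can be compared directly once each has been quantised to a symbol sequence. A Gram matrix built from `normalised_kernel` could then be given to any SVM implementation that accepts precomputed kernels.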
