Effect of Feature Dimension on Classification of Speech Emotions

Abstract

This paper analyses both the static characteristics and the temporal dynamics of spectral features for classifying speech emotions. First, several frame-level spectral features are examined: Linear Prediction Cepstral Coefficients (LPCC), Perceptual Linear Prediction (PLP) coefficients, and Mel-Frequency Cepstral Coefficients (MFCC). These spectral features are then also extracted using Wavelet Analysis (WA) for a better representation of emotion. The extracted feature sets remain high-dimensional and burden the recognizer with redundant features, large memory requirements, and slow response. To alleviate these issues and obtain more discriminative parameters, the applicability of Vector Quantization to clustering the data is explored. Machine learning algorithms, namely the Gaussian Mixture Model (GMM), the Probabilistic Neural Network (PNN), and the Multilayer Perceptron (MLP), are simulated with the derived feature sets to assess their effectiveness in classifying speech emotions. While the GMM is efficient at classifying features at the frame-level dimension, our results show that the NN-based classifiers outperform it at low feature dimensions.
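The dimension-reduction step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the Vector Quantization codebook is trained, so the k-means (Lloyd) iteration below, the function name `vq_codebook`, the codebook size of 8, and the flattening of the codebook into a single utterance vector are all assumptions made for illustration. Random numbers stand in for real frame-level features such as MFCCs.

```python
import numpy as np

def vq_codebook(frames, n_codewords=8, n_iter=20, seed=0):
    """Reduce (n_frames, n_dims) features to an (n_codewords, n_dims) codebook.

    Hypothetical helper: plain k-means is assumed here, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Initialise the codewords from randomly chosen distinct frames.
    idx = rng.choice(len(frames), size=n_codewords, replace=False)
    codebook = frames[idx].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every frame to every codeword.
        dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_codewords):
            members = frames[labels == k]
            if len(members) > 0:  # keep the old codeword if its cell is empty
                codebook[k] = members.mean(axis=0)
    return codebook

# Synthetic stand-in for one utterance: 300 frames of 13 coefficients each.
rng = np.random.default_rng(1)
frames = rng.normal(size=(300, 13))

codebook = vq_codebook(frames, n_codewords=8)
# Assumed design choice: flatten the codebook into one fixed-length vector
# that a neural-network classifier could take as input.
utterance_vector = codebook.ravel()
print(frames.shape, codebook.shape, utterance_vector.shape)
```

In this sketch a variable number of frames collapses to a fixed number of codewords, which is one plausible way to obtain the low-dimensional input that the NN-based classifiers are reported to handle well.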