Conference: IEEE International Conference on Acoustics, Speech, and Signal Processing

MULTIPLE WINDOWED SPECTRAL FEATURES FOR EMOTION RECOGNITION

Abstract

MFCC (Mel-frequency cepstral coefficients) and PLP (perceptual linear prediction coefficients) or RASTA-PLP have demonstrated good results both when combined with prosodic features, which carry suprasegmental (long-term) information, and when used stand-alone as segmental (short-time) information. MFCC and PLP feature parameterization aims to represent speech in a way similar to how humans perceive sound. However, MFCC and PLP are usually computed from a Hamming-windowed periodogram spectrum estimate, which is characterized by large variance. In this paper we study the effect on emotion recognition performance of averaging spectral estimates obtained using a set of orthogonal tapers (windows). The multitaper MFCC and PLP are examined separately as short-time information vectors modeled using Gaussian mixture models (GMMs). When tested on the FAU AIBO spontaneous emotion corpus, multiple windowed spectral features achieve a relative improvement of 2.2% to 3.9% over single windowed ones for both the MFCC and PLP systems.
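The core idea of the abstract, averaging spectral estimates from several orthogonal tapers instead of relying on one Hamming-windowed periodogram, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which taper family or weighting was used, so the sine tapers, the uniform averaging, the frame length of 400 samples, and the choice of 6 tapers are all assumptions, as are the function names.

```python
import numpy as np


def sine_tapers(frame_len, num_tapers):
    """Return an orthonormal set of sine tapers, shape (num_tapers, frame_len).

    Assumption: the abstract only says "orthogonal tapers"; sine tapers are
    one common family, chosen here because they need no extra library.
    """
    n = np.arange(1, frame_len + 1)
    k = np.arange(1, num_tapers + 1)[:, None]
    return np.sqrt(2.0 / (frame_len + 1)) * np.sin(np.pi * k * n / (frame_len + 1))


def multitaper_spectrum(frame, tapers, n_fft=None):
    """Average the periodograms obtained with each taper (uniform weights)."""
    n_fft = n_fft or len(frame)
    # One tapered copy of the frame per row, one power spectrum per taper.
    spectra = np.abs(np.fft.rfft(tapers * frame, n=n_fft, axis=1)) ** 2
    return spectra.mean(axis=0)


def hamming_periodogram(frame, n_fft=None):
    """Single-window baseline: Hamming-windowed periodogram."""
    n_fft = n_fft or len(frame)
    window = np.hamming(len(frame))
    window = window / np.sqrt(np.sum(window ** 2))  # unit energy, like the tapers
    return np.abs(np.fft.rfft(window * frame, n=n_fft)) ** 2


# Toy comparison on a white-noise frame (hypothetical data, not FAU AIBO).
rng = np.random.default_rng(0)
frame = rng.standard_normal(400)
tapers = sine_tapers(len(frame), num_tapers=6)

single = hamming_periodogram(frame)
multi = multitaper_spectrum(frame, tapers)

# White noise has a flat true spectrum, so the spread of each estimate
# across frequency bins reflects the variance of the estimator.
print("single-window spread:", single.var())
print("multitaper spread:   ", multi.var())
```

In an MFCC or PLP front end, the multitaper estimate would simply take the place of the single-window periodogram before the filterbank stage; the rest of the feature pipeline is unchanged.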
