Source: Sensors (Basel, Switzerland)
Emotion Recognition from Chinese Speech for Smart Affective Services Using a Combination of SVM and DBN

Abstract

Accurate emotion recognition from speech is important for applications such as smart health care, smart entertainment, and other smart services. High-accuracy emotion recognition from Chinese speech is challenging because of the complexities of the Chinese language. In this paper, we explore how to improve the accuracy of speech emotion recognition, covering both speech signal feature extraction and emotion classification methods. Five types of features are extracted from each speech sample: mel-frequency cepstral coefficients (MFCC), pitch, formants, short-term zero-crossing rate, and short-term energy. By comparing statistical features with deep features extracted by a Deep Belief Network (DBN), we attempt to find the features that best identify the emotional state of speech. We propose a novel classification method that combines a DBN and a support vector machine (SVM) instead of using only one of them. In addition, a conjugate gradient method is applied to train the DBN in order to speed up training. Gender-dependent experiments are conducted using an emotional speech database created by the Chinese Academy of Sciences. The results show that DBN features reflect emotional state better than hand-crafted features, and our new classification approach achieves an accuracy of 95.8%, which is higher than using either DBN or SVM alone. The results also show that a properly designed DBN can work very well with small training databases.
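
Two of the five feature types named in the abstract, short-term energy and short-term zero-crossing rate, are simple enough to illustrate directly. The following is a minimal sketch in plain Python, not the authors' implementation. The function names, the frame length, and the hop size are assumptions chosen for illustration. The abstract does not state the framing parameters used in the paper.

```python
def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_term_energy(frame):
    """Sum of squared amplitudes within one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def frame_features(samples, frame_len=4, hop=2):
    """Return one (energy, zero-crossing rate) pair per frame.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    return [(short_term_energy(f), zero_crossing_rate(f))
            for f in frame_signal(samples, frame_len, hop)]
```

Per-frame values like these are typically summarized with statistics such as the mean or maximum over an utterance before being passed to a classifier. Which statistics the paper uses is not given in the abstract.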
