Journal: 《高技术通讯》

Research on a Speech Emotion Feature Dimensionality Reduction Method Based on Convolutional Neural Network Learning

         

Abstract

To address the need for fast and accurate analysis of speech emotion in speech signal cognition, a feature dimensionality reduction method based on convolutional neural network (CNN) learning is proposed. A large number of features are first extracted from the original speech emotion data, and features of different dimensions are normalized to obtain the corresponding feature matrix. A CNN is then trained on the feature matrix, and the weights of the fully connected layer of the converged network are analyzed. Based on the learning characteristics of the network, a CNN-based feature selection criterion (FR-CNN) is defined: by comparing the activation weights of the features for each class, the features most favorable for classification are computed and selected, yielding a reduced and efficient speech emotion feature set F.

Experiments were carried out on all eight emotion classes of the multimodal emotion database CHEAVD, provided by the Institute of Automation, Chinese Academy of Sciences. The class-average recognition error rate of the CNN classifier built on the full feature set is 2.1% lower than the baseline, while that of the same CNN classifier built on the reduced feature set F is 9.4% lower than the baseline. Using only 15% of the features in the original set, the method not only speeds up classifier convergence but also lowers the recognition error rate, and it reduces system complexity when building a practical speech emotion recognition system.

By combining different types of feature information and using the learning characteristics of the CNN for secondary feature selection and dimensionality reduction, the study offers a new approach to feature extraction for speech emotion.
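The pipeline described in the abstract (normalize the features, inspect the learned weights, keep a small share of the features) can be illustrated with a minimal sketch. The abstract does not give the FR-CNN formula, so the scoring rule below (variance of a feature's weights across classes) is only an assumed stand-in. The function names, the `class_weights` layout, and the one-weight-per-feature-per-class simplification are all hypothetical and not taken from the paper.

```python
from typing import List


def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Scale every feature column to [0, 1] so dimensions are comparable."""
    scaled_columns = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information; map it to 0.0.
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(r) for r in zip(*scaled_columns)]


def weight_spread_scores(class_weights: List[List[float]]) -> List[float]:
    """Score each feature by how differently the classes weight it.

    class_weights[c][j] is the (assumed) learned weight linking feature j
    to class c. The score is the variance across classes: an illustrative
    stand-in, NOT the paper's FR-CNN criterion.
    """
    n_classes = len(class_weights)
    scores = []
    for per_class in zip(*class_weights):
        mean = sum(per_class) / n_classes
        scores.append(sum((w - mean) ** 2 for w in per_class) / n_classes)
    return scores


def select_top_features(scores: List[float], keep_ratio: float = 0.15) -> List[int]:
    """Return indices of the highest-scoring features, in ascending order."""
    k = max(1, round(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    # Toy weights for 2 classes and 4 features (made-up numbers).
    weights = [[0.9, 0.1, 0.0, 0.2],
               [-0.8, 0.1, 0.0, 0.1]]
    kept = select_top_features(weight_spread_scores(weights), keep_ratio=0.5)
    print(kept)  # prints [0, 3]
```

The default `keep_ratio=0.15` mirrors the 15% figure reported in the abstract. In a real system the weights would come from the fully connected layer of the converged CNN, and the classifier would then be retrained on the selected columns only.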
