Conference: National Conference on Communications

Emotion Recognition from Varying Length Patterns of Speech using CNN-based Segment-Level Pyramid Match Kernel based SVMs



Abstract

Convolutional neural networks (CNNs) and their variants have achieved impressive performance on speech processing tasks such as spoken language identification, speaker verification, and speech emotion recognition. Conventionally, CNNs for speech applications take features from fixed-duration speech segments as input. In this work, we instead feed features from the complete speech signal to the CNN. We propose using a spatial pyramid pooling (SPP) layer in the CNN architecture to remove the fixed-length constraint, so that features from varying-length speech signals can serve as input to the CNN for end-to-end training. The proposed architecture also yields a varying-size set of feature maps from the convolution layer. Further, we propose a novel CNN-based segment-level pyramid match kernel (CNN-SLPMK) as a dynamic kernel between a pair of varying-size sets of feature maps, for use in a classification framework based on support vector machines (SVMs). We demonstrate that the proposed approach achieves results comparable to state-of-the-art techniques on the speech emotion recognition task.
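To illustrate how pyramid pooling removes the fixed-length constraint, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a single feature map of shape (channels, frames) and pools along the time axis only. The function name `temporal_pyramid_pool` and the levels (1, 2, 4) are hypothetical choices made for this example.

```python
import numpy as np

def temporal_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (channels, frames) feature map into a fixed-length vector.

    Assumed, illustrative variant of pyramid pooling: for each level with
    n bins, the time axis is split into n nearly equal segments and every
    channel is max-pooled within each segment.
    """
    channels, frames = feature_map.shape
    if frames < max(levels):
        raise ValueError("need at least as many frames as bins in the finest level")
    pooled = []
    for n_bins in levels:
        # Integer bin edges that always span the full time axis.
        edges = [(k * frames) // n_bins for k in range(n_bins + 1)]
        for start, stop in zip(edges[:-1], edges[1:]):
            pooled.append(feature_map[:, start:stop].max(axis=1))
    return np.concatenate(pooled)

# Two utterances of different duration map to vectors of the same length.
rng = np.random.default_rng(0)
short_utt = rng.standard_normal((8, 37))   # 8 channels, 37 frames
long_utt = rng.standard_normal((8, 212))   # 8 channels, 212 frames
print(temporal_pyramid_pool(short_utt).shape, temporal_pyramid_pool(long_utt).shape)
```

Both calls return a vector of length 8 × (1 + 2 + 4) = 56, regardless of the number of frames. This is the property that lets the fully connected layers after the pooling layer accept inputs of any duration.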
