
Connecting Subspace Learning and Extreme Learning Machine in Speech Emotion Recognition



Abstract

Speech emotion recognition (SER) is a powerful tool for endowing computers with the capacity to process information about the affective states of users in human-machine interactions. Recent research has shown the effectiveness of graph embedding-based subspace learning and extreme learning machines applied to SER, but there are still various drawbacks in these two techniques that limit their application. Regarding subspace learning, the change from linearity to nonlinearity is usually achieved through kernelization, whereas extreme learning machines only take label information into consideration at the output layer. In order to overcome these drawbacks, this paper leverages extreme learning machines for dimensionality reduction and proposes a novel framework to combine spectral regression-based subspace learning and extreme learning machines. The proposed framework contains three stages: data mapping, graph decomposition, and regression. At the data mapping stage, various mapping strategies provide different views of the samples. At the graph decomposition stage, specifically designed embedding graphs make it possible to better represent the structure of the data by generating virtual coordinates. Finally, at the regression stage, dimension-reduced mappings are achieved by connecting the virtual coordinates and the data mapping. Using this framework, we propose several novel dimensionality reduction algorithms, apply them to SER tasks, and compare their performance to relevant state-of-the-art methods. Our results on several paralinguistic corpora show that our proposed techniques lead to significant improvements.
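The three stages named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the mappings, graphs, or regularization used in the paper, so everything specific here is an assumption made for illustration: a random sigmoid hidden layer stands in for the data mapping, a simple same-class graph stands in for the "specifically designed embedding graphs", and ridge regression stands in for the regression stage. The function names (`elm_mapping`, `virtual_coordinates`, `fit_projection`) and the parameter `alpha` are hypothetical and do not come from the paper.

```python
import numpy as np

def elm_mapping(X, n_hidden, rng):
    # Stage 1 (data mapping), assumed form: a random, untrained sigmoid
    # hidden layer, as in a basic extreme learning machine.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def virtual_coordinates(y):
    # Stage 2 (graph decomposition), assumed graph: samples are connected
    # only to samples of the same class, with weight 1 / class size.
    y = np.asarray(y)
    n = y.shape[0]
    n_classes = np.unique(y).size
    same = (y[:, None] == y[None, :]).astype(float)
    W = same / same.sum(axis=1, keepdims=True)
    # The eigenvalue-1 eigenvectors of W span the class-indicator vectors.
    _, vecs = np.linalg.eigh(W)
    top = vecs[:, -n_classes:]
    # Remove the trivial constant direction, then re-orthonormalize.
    ones = np.ones(n) / np.sqrt(n)
    top = top - np.outer(ones, ones @ top)
    U, _, _ = np.linalg.svd(top, full_matrices=False)
    return U[:, : n_classes - 1]

def fit_projection(H, Y, alpha=1.0):
    # Stage 3 (regression), assumed form: ridge regression from the mapped
    # data H onto the virtual coordinates Y.
    A = H.T @ H + alpha * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ Y)

# Toy data only: 30 random samples, 5 features, 3 balanced classes.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 5))
y = np.repeat([0, 1, 2], 10)

H = elm_mapping(X, n_hidden=20, rng=rng)
Y = virtual_coordinates(y)
P = fit_projection(H, Y)
Z = H @ P  # dimension-reduced representation, one row per sample
```

In this sketch the label information enters through the graph rather than only at an output layer, and the nonlinearity comes from the random hidden layer rather than from a kernel. To embed unseen samples, the same hidden-layer weights would have to be stored and reused, which this sketch does not do.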

Bibliographic record

  • Source
    IEEE Transactions on Multimedia | 2019, Issue 3 | pp. 795-808 | 14 pages
  • Author affiliations

    Nanjing Univ Posts & Telecommun, Coll Internet Things, Nanjing 210003, Jiangsu, Peoples R China|Southeast Univ, Minist Educ, Key Lab Underwater Acoust Signal Proc, Nanjing 210096, Jiangsu, Peoples R China|Tech Univ Munich, MMK, Machine Intelligence & Signal Proc Grp, D-80290 Munich, Germany;

    AudEERING GmbH, D-82205 Gilching, Germany;

    Imperial Coll London, Dept Comp, London SW7 2AZ, England|Univ Liverpool, Dept Mus, Liverpool L69 3BX, Merseyside, England;

    Nanjing Univ Posts & Telecommun, Coll Internet Things, Nanjing 210003, Jiangsu, Peoples R China;

    Southeast Univ, Minist Educ, Key Lab Underwater Acoust Signal Proc, Nanjing 210096, Jiangsu, Peoples R China;

    Imperial Coll London, Grp Language Audio & Mus, London SW7 2AZ, England|Univ Augsburg, Chair Embedded Intelligence Hlth Care & Wellbeing, D-86159 Augsburg, Germany;

  • Indexing information
  • Original format: PDF
  • Text language: English (eng)
  • Chinese Library Classification
  • Keywords

    Speech emotion recognition; extreme learning machine; subspace learning; graph embedding; spectral regression;

