IEEE International Conference on Acoustics, Speech and Signal Processing

Improving Speech Emotion Recognition with Unsupervised Representation Learning on Unlabeled Speech

Abstract

In this paper we present our findings on how representation learning on large unlabeled speech corpora can be beneficially utilized for speech emotion recognition (SER). Prior work on representation learning for SER mostly focused on relatively small emotional speech datasets without making use of additional unlabeled speech data. We show that integrating representations learnt by an unsupervised autoencoder into a CNN-based emotion classifier improves recognition accuracy. To gain insights into what those models learn, we analyze visualizations of the different representations using t-distributed stochastic neighbor embedding (t-SNE). We evaluate our approach on IEMOCAP and MSP-IMPROV by means of within- and cross-corpus testing.
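
A minimal sketch of the approach the abstract describes: representations learned by an unsupervised autoencoder are fed into a CNN-based emotion classifier. The sketch below uses PyTorch; all layer sizes, feature dimensions, the class count, and the fusion-by-concatenation step are illustrative assumptions, not details taken from the paper.

```python
# Sketch only (not the authors' code): an unsupervised autoencoder supplies a
# learned representation that a CNN-based classifier consumes alongside its own
# convolutional features. All dimensions below are assumptions for illustration.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Autoencoder pretrained on unlabeled speech features (reconstruction loss)."""
    def __init__(self, n_feats=40, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_feats, 128), nn.ReLU(),
                                     nn.Linear(128, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, n_feats))

    def forward(self, x):
        z = self.encoder(x)           # learned representation
        return self.decoder(z), z     # reconstruction and latent code

class EmotionCNN(nn.Module):
    """CNN emotion classifier that also consumes the autoencoder latent code."""
    def __init__(self, n_feats=40, latent_dim=64, n_classes=4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_feats, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1))
        self.classifier = nn.Linear(64 + latent_dim, n_classes)

    def forward(self, feats, latent):
        # feats: (batch, n_feats, time); latent: (batch, latent_dim)
        h = self.conv(feats).squeeze(-1)              # (batch, 64)
        return self.classifier(torch.cat([h, latent], dim=1))

# Intended use, following the abstract: pretrain the autoencoder on large
# unlabeled speech corpora, then train the classifier on labeled emotion data
# (e.g. IEMOCAP, MSP-IMPROV) while reusing the encoder's representation.
ae, clf = Autoencoder(), EmotionCNN()
feats = torch.randn(8, 40, 300)                       # e.g. 40-dim frame features
_, z = ae(feats.mean(dim=2))                          # utterance-level latent code
logits = clf(feats, z.detach())                       # scores for 4 emotion classes
```

For the visualization step mentioned in the abstract, the learned representations could be projected to 2-D with scikit-learn's sklearn.manifold.TSNE, which implements t-SNE.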
