IEEE International Conference on Acoustics, Speech and Signal Processing

Speaker-Invariant Affective Representation Learning via Adversarial Training



Abstract

Representation learning for speech emotion recognition is challenging due to labeled-data sparsity and the lack of gold-standard references. In addition, there is considerable variability in the input speech signals, in human subjective perception of those signals, and in the ambiguity of emotion labels. In this paper, we propose a machine learning framework that obtains speech emotion representations by limiting the effect of speaker variability in the speech signals. Specifically, we propose to disentangle speaker characteristics from emotion through an adversarial training network in order to better represent emotion. Our method combines the gradient reversal technique with an entropy loss function to remove such speaker information. Our approach is evaluated on both the IEMOCAP and CMU-MOSEI datasets. We show that our method improves speech emotion classification and generalizes better to unseen speakers.
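The core mechanism in the abstract, gradient reversal, passes features through unchanged in the forward pass but negates (and scales) the gradient flowing back from the speaker classifier, so the encoder is pushed to *destroy* speaker information while the classifier tries to recover it. The toy NumPy sketch below illustrates only this mechanism on a single feature vector; it is not the authors' implementation, and the weights, feature, and learning rate are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy(p, eps=1e-12):
    """Shannon entropy (nats) of a probability vector."""
    return -np.sum(p * np.log(p + eps), axis=-1)

def grad_reversal(grad, lam=1.0):
    """Gradient reversal layer: identity in the forward pass; in the
    backward pass the incoming gradient is multiplied by -lam before
    it reaches the encoder."""
    return -lam * grad

# Hypothetical fixed linear speaker classifier W on top of a feature h.
W = np.array([[2.0, 0.0],
              [0.0, 2.0]])   # speaker-classifier weights (illustrative)
h = np.array([1.0, 0.0])     # encoder feature for one utterance
y = np.array([1.0, 0.0])     # one-hot speaker label

p_before = softmax(h @ W)
# Cross-entropy gradient w.r.t. the feature h ...
grad_h = (p_before - y) @ W.T
# ... reversed before the encoder-side SGD step (lr = 1.0 here).
h = h - 1.0 * grad_reversal(grad_h)
p_after = softmax(h @ W)
# The reversed update makes the speaker posterior LESS confident
# (higher entropy): the feature now carries less speaker information.
```

The paper's entropy loss plays a complementary role: rather than only flipping the classifier's gradient, it explicitly drives the speaker posterior toward the uniform distribution (maximum entropy), which the `entropy` helper above would measure.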
