International Conference on Automatic Face and Gesture Recognition

Audio-Visual Emotion Forecasting: Characterizing and Predicting Future Emotion Using Deep Learning


Abstract

Emotion forecasting is the task of predicting the future emotion of a speaker, i.e., the emotion label of a future speaking turn, based on the speaker's past and current audio-visual cues. Emotion forecasting systems require new problem formulations that differ from those of traditional emotion recognition systems. In this paper, we first explore two types of forecasting windows (i.e., analysis windows for which the speaker's emotion is being forecasted): utterance forecasting and time forecasting. Utterance forecasting is based on speaking turns and forecasts what the speaker's emotion will be after one, two, or three speaking turns. Time forecasting forecasts what the speaker's emotion will be after a certain range of time, such as 3-8, 8-13, and 13-18 seconds. We then investigate the benefit of using past audio-visual cues in addition to the current utterance. We design emotion forecasting models using deep learning. We compare the performance of a fully connected deep neural network (FC-DNN), a deep long short-term memory (D-LSTM) network, and a deep bidirectional long short-term memory (D-BLSTM) recurrent neural network (RNN). This allows us to examine the benefit of modeling dynamic patterns in emotion forecasting tasks. Our experimental results on the IEMOCAP benchmark dataset demonstrate that D-BLSTM and D-LSTM outperform FC-DNN by up to 2.42% in unweighted recall. When using both the current and past utterances, deep dynamic models show an improvement of up to 2.39% compared to their performance when using only the current utterance. We further analyze the benefit of using current and past utterance information compared to using the current and randomly chosen utterance information, and we find the performance improvement rises to 7.53%. The novelty of this study comes from its formulation of emotion forecasting problems and its analysis of how current and past audio-visual cues reveal future emotional information.
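To make the utterance-forecasting formulation concrete, the sketch below pairs each speaking turn's features with the emotion label one or more turns ahead. This is a minimal illustration assuming each dialog is stored as a chronological list of (features, label) turns; the data layout and function name are hypothetical, not the paper's actual IEMOCAP preprocessing.

```python
# A minimal sketch of the utterance-forecasting setup, assuming each dialog
# is stored as a chronological list of (features, emotion_label) speaking
# turns. The data layout is hypothetical, not the paper's IEMOCAP pipeline.
from typing import List, Tuple

import numpy as np


def make_utterance_forecasting_pairs(
    dialog: List[Tuple[np.ndarray, int]],
    horizon: int = 1,  # forecast 1, 2, or 3 speaking turns ahead
) -> List[Tuple[np.ndarray, int]]:
    """Pair each turn's features with the label `horizon` turns ahead."""
    pairs = []
    for t in range(len(dialog) - horizon):
        features, _ = dialog[t]                # current utterance features
        _, future_label = dialog[t + horizon]  # emotion of the future turn
        pairs.append((features, future_label))
    return pairs
```

Time forecasting would instead select the label of the utterance that falls within a given offset range (e.g., 3-8 seconds) from the current turn, rather than counting speaking turns.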
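Among the compared models, the D-BLSTM is the one that exploits temporal context in both directions. A minimal PyTorch sketch of such a classifier is shown below; the hidden size, depth, mean-over-time pooling, and four-class output are illustrative assumptions, not the configuration reported in the paper.

```python
# A minimal PyTorch sketch of a deep bidirectional LSTM (D-BLSTM) emotion
# forecaster over a sequence of frame-level audio-visual features.
# Hidden size, depth, pooling, and the 4-class output are assumptions.
import torch
import torch.nn as nn


class DBLSTMForecaster(nn.Module):
    def __init__(self, input_dim: int, hidden_dim: int = 128,
                 num_layers: int = 2, num_classes: int = 4):
        super().__init__()
        self.blstm = nn.LSTM(input_dim, hidden_dim, num_layers=num_layers,
                             batch_first=True, bidirectional=True)
        # Bidirectional outputs concatenate forward and backward states.
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, input_dim) feature sequence for one utterance
        outputs, _ = self.blstm(x)
        pooled = outputs.mean(dim=1)    # average over time steps
        return self.classifier(pooled)  # logits over future emotion classes
```

Unweighted recall, the metric quoted in the abstract, averages per-class recall without weighting by class frequency (i.e., macro-averaged recall), which avoids inflating scores on class-imbalanced corpora such as IEMOCAP.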
