International Conference on Automatic Face and Gesture Recognition

Audio-Visual Emotion Forecasting: Characterizing and Predicting Future Emotion Using Deep Learning


Abstract

Emotion forecasting is the task of predicting the future emotion of a speaker, i.e., the emotion label of a future speaking turn, based on the speaker's past and current audio-visual cues. Emotion forecasting systems require new problem formulations that differ from those of traditional emotion recognition systems. In this paper, we first explore two types of forecasting windows (i.e., analysis windows for which the speaker's emotion is being forecasted): utterance forecasting and time forecasting. Utterance forecasting is based on speaking turns and predicts what the speaker's emotion will be after one, two, or three speaking turns. Time forecasting predicts what the speaker's emotion will be after a given interval of time, such as 3-8, 8-13, or 13-18 seconds. We then investigate the benefit of using past audio-visual cues in addition to the current utterance. We design emotion forecasting models using deep learning. We compare the performance of a fully-connected deep neural network (FC-DNN), a deep long short-term memory (D-LSTM) recurrent neural network (RNN), and a deep bidirectional long short-term memory (D-BLSTM) RNN. This allows us to examine the benefit of modeling dynamic patterns in emotion forecasting tasks. Our experimental results on the IEMOCAP benchmark dataset demonstrate that D-BLSTM and D-LSTM outperform FC-DNN by up to 2.42% in unweighted recall. When using both the current and past utterances, the deep dynamic models show an improvement of up to 2.39% over their performance when using only the current utterance. We further compare the benefit of using the current and past utterances against using the current utterance and a randomly chosen utterance, and we find that the performance improvement rises to 7.53%. The novelty of this study lies in its formulation of the emotion forecasting problem and in its analysis of how current and past audio-visual cues reveal future emotional information.
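
To make the two window formulations concrete, the sketch below pairs a current speaking turn with its forecasting target under each scheme. It is a minimal illustration in plain Python: the Turn record, its field names, and the two helper functions are assumptions made for this sketch, not structures taken from the paper.

```python
# Hypothetical sketch of the two forecasting-window formulations described
# in the abstract; the Turn structure and field names are illustrative.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Turn:
    speaker: str   # speaker id, e.g. "A" or "B"
    start: float   # turn start time in seconds
    end: float     # turn end time in seconds
    emotion: str   # categorical emotion label of this turn

def utterance_target(turns: List[Turn], i: int, k: int) -> Optional[str]:
    """Utterance forecasting: emotion label of the same speaker's
    k-th future speaking turn (k = 1, 2, or 3)."""
    future = [t for t in turns[i + 1:] if t.speaker == turns[i].speaker]
    return future[k - 1].emotion if len(future) >= k else None

def time_target(turns: List[Turn], i: int,
                lo: float, hi: float) -> Optional[str]:
    """Time forecasting: emotion label of the same speaker's first turn
    that starts lo..hi seconds after the current turn ends (e.g. 3-8 s)."""
    anchor = turns[i].end
    for t in turns[i + 1:]:
        if t.speaker == turns[i].speaker and lo <= t.start - anchor < hi:
            return t.emotion
    return None
```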
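For the model comparison, a deep bidirectional LSTM of the kind the paper evaluates could be sketched in PyTorch as follows. The feature dimension, hidden size, number of layers, and four-class output are illustrative assumptions, not the authors' reported configuration.

```python
# A minimal PyTorch sketch of a deep bidirectional LSTM (D-BLSTM)
# emotion forecaster; all sizes below are assumed for illustration.
import torch
import torch.nn as nn

class DBLSTMForecaster(nn.Module):
    def __init__(self, feat_dim: int = 100, hidden: int = 128,
                 layers: int = 2, n_emotions: int = 4):
        super().__init__()
        # Stacked bidirectional LSTM over the (past + current) feature sequence.
        self.blstm = nn.LSTM(feat_dim, hidden, num_layers=layers,
                             bidirectional=True, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, n_emotions)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, feat_dim) audio-visual feature sequence.
        out, _ = self.blstm(x)
        # Predict the future emotion from the final time step's hidden state.
        return self.classifier(out[:, -1, :])

# Usage on a dummy batch: 8 sequences of 200 feature frames each.
model = DBLSTMForecaster()
logits = model(torch.randn(8, 200, 100))   # -> (8, 4) class scores
```

In such a setup, the input sequence would concatenate the past and current utterance features; per the abstract, feeding past context in addition to the current utterance is what yields up to a 2.39% gain for the deep dynamic models.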
