Multimodal Emotion Classification by Streaming Fixed Time Segments for Speaker Movies

Abstract

This approach to Video-Audio Emotion Recognition exploits the additional information available from multiple modalities. Because the target features are related in time but not strictly aligned in time, video-audio features are treated simply as video features and audio features. Toward that goal, the spectrogram, an outstanding vocal feature in neural network solutions, is selected so that it can benefit from convolution filters. Inspired by the LSTM solution to image captioning, where embedded word information and image information are spatially aligned, we embed the audio spectrogram and the image sequences, since the spectrogram converts time information into spatial information. We propose both an architecture and a framework that optimize the alignment of these temporal features, and we analyze the resulting significant performance improvement and discuss general Video-Audio Emotion Recognition tasks.
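The idea that a spectrogram turns time information into a spatial axis can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no segment length, sample rate, or STFT settings, so the 1-second segments, 16 kHz rate, 256-sample frames, and the helper names `fixed_segments` and `spectrogram` are all assumptions chosen for illustration.

```python
import numpy as np

def fixed_segments(signal, sample_rate, segment_seconds):
    """Split a 1-D signal into non-overlapping segments of equal duration.

    Hypothetical helper; any incomplete tail segment is dropped.
    """
    seg_len = int(round(sample_rate * segment_seconds))
    n_segments = len(signal) // seg_len
    return signal[: n_segments * seg_len].reshape(n_segments, seg_len)

def spectrogram(segment, frame_len=256, hop=128):
    """Magnitude spectrogram: rows are frequency bins, columns are time frames.

    Frame length and hop are assumed values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(segment) - frame_len) // hop
    frames = np.stack(
        [segment[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Synthetic stand-in for a speaker's audio track: 3 seconds of a 440 Hz tone.
sample_rate = 16000
t = np.arange(3 * sample_rate) / sample_rate
audio = np.sin(2 * np.pi * 440.0 * t)

segments = fixed_segments(audio, sample_rate, segment_seconds=1.0)
specs = np.stack([spectrogram(s) for s in segments])
print(specs.shape)  # (segments, frequency bins, time frames)
```

Each segment becomes a 2-D array whose horizontal axis is time, which is what allows convolution filters to treat temporal structure as spatial structure. Under the same assumptions, the video frames of each fixed segment could be stacked alongside the matching spectrogram.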

Bibliographic record

Source
Conference on Photonics Applications in Astronomy, Communications, Industry, and High Energy Physics Experiments | 2020 | 1158108.1-1158108.12 | 12 pages
Authors
Xin Chang; Wladyslaw Skarbek
Original format: PDF
Keywords
Multimodal emotional classification; video-audio features alignment; features embedding
