Engineering Applications of Artificial Intelligence

Emotion recognition using speech and neural structured learning to facilitate edge intelligence



Abstract

Emotions are quite important in our daily communications, and recent years have witnessed many research works developing reliable emotion recognition systems based on various types of data sources, such as audio and video. Since no visual information of human faces is available, emotion analysis based on audio data alone is a very challenging task. In this work, a novel emotion recognition approach is proposed based on robust features and machine learning from audio speech. For a person-independent emotion recognition system, audio data is used as input, from which Mel Frequency Cepstrum Coefficients (MFCC) are calculated as features. The MFCC features are then processed by discriminant analysis to minimize the intra-class scatter while maximizing the inter-class scatter. The robust discriminant features are then fed to Neural Structured Learning (NSL), an efficient and fast deep learning approach, for emotion training and recognition. In experiments on an emotion dataset of audio speeches, the proposed combination of MFCC, discriminant analysis, and NSL produced superior recognition rates compared to traditional approaches such as MFCC-DBN, MFCC-CNN, and MFCC-RNN. The system can be adopted in smart environments such as homes or clinics to provide affective healthcare. Since NSL is fast and easy to implement, it can be deployed on edge devices with limited datasets collected from edge sensors. Hence, the decision-making step can be pushed towards where the data resides, rather than conventionally processing data and making decisions far away from the data sources. The proposed approach can be applied in practical applications such as understanding people's emotions in daily life, or assessing stress from the voices of pilots or air traffic controllers in air traffic management systems.
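The discriminant-analysis step described above (minimizing intra-class scatter while maximizing inter-class scatter) is classical Fisher linear discriminant analysis. The abstract does not give the authors' exact formulation, so the following is only a minimal NumPy sketch of that step, applied to generic feature vectors standing in for per-utterance MFCC features; the function name and toy data are illustrative, not from the paper.

```python
import numpy as np

def lda_fit(X, y, n_components):
    """Fisher LDA: find projections that maximize inter-class scatter
    relative to intra-class scatter.

    X : (n_samples, n_features) feature matrix (e.g. MFCC vectors)
    y : (n_samples,) integer class labels (e.g. emotion categories)
    Returns a (n_features, n_components) projection matrix.
    """
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))  # within-class (intra-class) scatter
    S_b = np.zeros((d, d))  # between-class (inter-class) scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        S_w += (Xc - mc).T @ (Xc - mc)
        diff = (mc - overall_mean).reshape(-1, 1)
        S_b += len(Xc) * (diff @ diff.T)
    # Generalized eigenproblem S_w^{-1} S_b w = lambda w; keep the
    # eigenvectors with the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1]
    return eigvecs.real[:, order[:n_components]]

# Toy usage: two well-separated "emotion" classes in 4-D feature space.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0, 0, 0], 1.0, (50, 4)),
               rng.normal([3, 3, 0, 0], 1.0, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
W = lda_fit(X, y, n_components=1)
Z = X @ W  # discriminant features passed on to the classifier (NSL in the paper)
```

The projected features `Z` would then be the input to the NSL classifier; with at most `C - 1` useful discriminant directions for `C` classes, this step also reduces dimensionality before training.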
