Engineering Applications of Artificial Intelligence

Emotion recognition using speech and neural structured learning to facilitate edge intelligence



Abstract

Emotions are quite important in our daily communications, and recent years have witnessed many research works developing reliable emotion recognition systems based on various types of data sources, such as audio and video. Since no visual information of human faces is available, emotion analysis based on audio data alone is a very challenging task. In this work, a novel emotion recognition approach is proposed based on robust features and machine learning from audio speech. For a person-independent emotion recognition system, audio data is used as input, from which Mel Frequency Cepstrum Coefficients (MFCC) are calculated as features. The MFCC features are then processed by discriminant analysis to minimize the intra-class scatter while maximizing the inter-class scatter. The robust discriminant features are then fed to Neural Structured Learning (NSL), an efficient and fast deep learning approach, for emotion training and recognition. In experiments on an emotion dataset of audio speeches, the proposed combination of MFCC, discriminant analysis, and NSL produced superior recognition rates compared to traditional approaches such as MFCC-DBN, MFCC-CNN, and MFCC-RNN. The system can be adopted in smart environments such as homes or clinics to provide affective healthcare. Since NSL is fast and easy to implement, it can be deployed on edge devices with limited datasets collected from edge sensors. Hence, the decision-making step can be pushed towards where the data resides, rather than conventionally processing data and making decisions far away from the data sources. The proposed approach can be applied in practical applications such as understanding people's emotions in daily life, or assessing stress from the voices of pilots or air traffic controllers in air traffic management systems.
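The discriminant-analysis step described above (minimizing intra-class scatter while maximizing inter-class scatter) is classical Fisher linear discriminant analysis. The abstract does not give the authors' exact formulation, so the following is only a minimal NumPy sketch of that step, applied to generic feature vectors standing in for per-utterance MFCC features; the function name and toy data are illustrative, not from the paper.

```python
import numpy as np

def lda_fit(X, y, n_components):
    """Fisher LDA: find projections that maximize inter-class scatter
    relative to intra-class scatter.

    X : (n_samples, n_features) feature matrix (e.g. MFCC vectors)
    y : (n_samples,) integer class labels (e.g. emotion categories)
    Returns a (n_features, n_components) projection matrix.
    """
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))  # within-class (intra-class) scatter
    S_b = np.zeros((d, d))  # between-class (inter-class) scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        S_w += (Xc - mc).T @ (Xc - mc)
        diff = (mc - overall_mean).reshape(-1, 1)
        S_b += len(Xc) * (diff @ diff.T)
    # Generalized eigenproblem S_w^{-1} S_b w = lambda w; keep the
    # eigenvectors with the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1]
    return eigvecs.real[:, order[:n_components]]

# Toy usage: two well-separated "emotion" classes in 4-D feature space.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0, 0, 0], 1.0, (50, 4)),
               rng.normal([3, 3, 0, 0], 1.0, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
W = lda_fit(X, y, n_components=1)
Z = X @ W  # discriminant features passed on to the classifier (NSL in the paper)
```

The projected features `Z` would then be the input to the NSL classifier; with at most `C - 1` useful discriminant directions for `C` classes, this step also reduces dimensionality before training.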
