International Conference on Intelligent Human Computer Interaction

Screening Trauma Through CNN-Based Voice Emotion Classification



Abstract

Recently, many people experience trauma symptoms for various reasons. Trauma causes problems with emotional control and anxiety. Although a psychiatric diagnosis is essential, people are reluctant to visit hospitals. In this paper, we propose a method for screening trauma from voice audio data using convolutional neural networks. Among the six basic emotions, four were used for screening trauma: fear, sadness, happiness, and neutral. In the first pre-processing step, the audio data are cut into 2 s segments and the number of samples is augmented; in the second, each voice segment is converted into a spectrogram image by the short-time Fourier transform. The spectrogram images are used to train four convolutional neural networks. Of these, the VGG-13 model showed the highest trauma-screening accuracy (98.96%). As post-processing, a decision-level fusion strategy determines the final traumatic state by confirming that the traumatic state estimated by the trained VGG-13 model persists over consecutive observations. The results confirm that high-accuracy voice-based trauma screening is possible, depending on the setting value chosen for continuous state observation.
