Journal: Quality Control, Transactions

Feature Augmenting Networks for Improving Depression Severity Estimation From Speech Signals


Abstract

Depressive disorder has become one of the major psychological illnesses endangering human health. Researchers in the affective computing community are supporting the development of reliable depression severity estimation systems that draw on multiple modalities (speech, face, text) to assist doctors in their diagnosis. However, the limited amount of annotated data has become the main bottleneck restricting research on depression screening, especially when deep learning models are used. To alleviate this issue, we propose using a Deep Convolutional Generative Adversarial Network (DCGAN) for feature augmentation to improve depression severity estimation from speech. To the best of our knowledge, this is the first attempt to apply a Generative Adversarial Network to depression severity estimation from speech. In addition, to measure the quality of the augmented features, we propose three measurement criteria that characterize the spatial, frequency, and representation-learning properties of the augmented features. Finally, the augmented features are used to train depression estimation models. Experiments are carried out on speech signals from the Audio Visual Emotion Challenge (AVEC 2016) depression dataset, and the relationship between model performance and data size is explored. Our experimental results show that: 1) the combination of the three proposed evaluation criteria can effectively and comprehensively evaluate the quality of the augmented features; 2) as the size of the augmented data increases, the performance of depression severity estimation gradually improves and the model converges to a stable state; 3) the proposed DCGAN-based data augmentation approach effectively improves the performance of depression severity estimation, reducing the root mean square error (RMSE) to 5.520 and the mean absolute error (MAE) to 4.634, which is better than most state-of-the-art results on AVEC 2016.
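
The two error measures reported in the abstract, RMSE and MAE, can be illustrated with a minimal Python sketch. The function names and the severity scores below are hypothetical and chosen only for illustration; they are not taken from the paper or from the AVEC 2016 data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length score lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length score lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


# Hypothetical severity scores (illustrative only, not from the paper).
true_scores = [2, 10, 17, 5]
pred_scores = [4, 7, 15, 9]

print(f"RMSE = {rmse(true_scores, pred_scores):.3f}")
print(f"MAE  = {mae(true_scores, pred_scores):.3f}")
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE and is never smaller than MAE on the same predictions, which is consistent with the reported RMSE (5.520) exceeding the reported MAE (4.634).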
