International Symposium on Chinese Spoken Language Processing

Towards Realizing Sign Language to Emotional Speech Conversion by Deep Learning



Abstract

This paper proposes a deep-learning framework for sign language to emotional speech conversion, aimed at overcoming communication barriers between people with language impairments and unimpaired people. We first train a gesture recognition model and a facial expression recognition model using a deep convolutional generative adversarial network (DCGAN). We then train an emotional speech acoustic model based on a hybrid long short-term memory (LSTM) network: the initials and finals of Mandarin are selected as the emotional speech synthesis units to train a speaker-independent average voice model (AVM), and speaker adaptation is applied to derive a speaker-dependent hybrid LSTM model from the AVM using one target speaker's emotional corpus. Finally, we combine the gesture recognition and facial expression recognition models with the emotional speech synthesis model to realize sign language to emotional speech conversion. Experiments show a gesture recognition rate of 93.96% and a facial expression recognition rate of 96.01% on the CK+ database. The converted emotional speech is not only of high quality but also accurately expresses the recognized facial expression.
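The conversion pipeline the abstract describes can be sketched end to end. This is a minimal illustrative sketch, not the authors' code: the three stub functions below stand in for the paper's DCGAN-based gesture and facial expression classifiers and the speaker-adapted hybrid-LSTM synthesizer, and all names and return values are hypothetical placeholders.

```python
# Hypothetical sketch of the sign-language-to-emotional-speech pipeline.
# The recognizers and synthesizer are stubs; in the paper they are
# DCGAN-based classifiers (gesture, facial expression) and a hybrid-LSTM
# acoustic model speaker-adapted from an average voice model (AVM).

def recognize_gesture(frames):
    """Stub: map video frames of a sign to a text label (paper: DCGAN)."""
    return "hello"  # placeholder label

def recognize_expression(frames):
    """Stub: map face frames to an emotion label (paper: DCGAN)."""
    return "happy"  # placeholder label

def synthesize_emotional_speech(text, emotion):
    """Stub: emotional TTS over Mandarin initial/final units
    (paper: hybrid LSTM, speaker-adapted from the AVM)."""
    return f"<waveform: '{text}' spoken with {emotion} prosody>"

def sign_to_emotional_speech(sign_frames, face_frames):
    # The two recognition outputs jointly drive synthesis: the gesture
    # supplies the text, the facial expression supplies the emotion.
    text = recognize_gesture(sign_frames)
    emotion = recognize_expression(face_frames)
    return synthesize_emotional_speech(text, emotion)

print(sign_to_emotional_speech([], []))
```

The design point the sketch captures is the decoupling: recognition and synthesis are trained separately and combined only at inference time, so either stage could be swapped out independently.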
