International Symposium on Chinese Spoken Language Processing

Towards Realizing Sign Language to Emotional Speech Conversion by Deep Learning



Abstract

This paper proposes a deep-learning framework for sign language to emotional speech conversion, aimed at overcoming communication barriers between people with language impairments and unimpaired people. We first train a gesture recognition model and a facial expression recognition model using a deep convolutional generative adversarial network (DCGAN). We then train an emotional speech acoustic model based on a hybrid long short-term memory (LSTM) network: the initials and finals of Mandarin are selected as the emotional speech synthesis units to train a speaker-independent average voice model (AVM), and speaker adaptation is applied to derive a speaker-dependent hybrid LSTM model from the AVM using one target speaker's emotional corpus. Finally, we combine the gesture recognition and facial expression recognition models with the emotional speech synthesis model to realize sign language to emotional speech conversion. Experiments show a gesture recognition rate of 93.96% and a facial expression recognition rate of 96.01% on the CK+ database. The converted emotional speech is not only of high quality but also accurately expresses the recognized facial expression.
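The conversion pipeline the abstract describes can be sketched end to end. This is a minimal illustrative sketch, not the authors' code: the three stub functions below stand in for the paper's DCGAN-based gesture and facial expression classifiers and the speaker-adapted hybrid-LSTM synthesizer, and all names and return values are hypothetical placeholders.

```python
# Hypothetical sketch of the sign-language-to-emotional-speech pipeline.
# The recognizers and synthesizer are stubs; in the paper they are
# DCGAN-based classifiers (gesture, facial expression) and a hybrid-LSTM
# acoustic model speaker-adapted from an average voice model (AVM).

def recognize_gesture(frames):
    """Stub: map video frames of a sign to a text label (paper: DCGAN)."""
    return "hello"  # placeholder label

def recognize_expression(frames):
    """Stub: map face frames to an emotion label (paper: DCGAN)."""
    return "happy"  # placeholder label

def synthesize_emotional_speech(text, emotion):
    """Stub: emotional TTS over Mandarin initial/final units
    (paper: hybrid LSTM, speaker-adapted from the AVM)."""
    return f"<waveform: '{text}' spoken with {emotion} prosody>"

def sign_to_emotional_speech(sign_frames, face_frames):
    # The two recognition outputs jointly drive synthesis: the gesture
    # supplies the text, the facial expression supplies the emotion.
    text = recognize_gesture(sign_frames)
    emotion = recognize_expression(face_frames)
    return synthesize_emotional_speech(text, emotion)

print(sign_to_emotional_speech([], []))
```

The design point the sketch captures is the decoupling: recognition and synthesis are trained separately and combined only at inference time, so either stage could be swapped out independently.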
