A Gesture-to-Emotional Speech Conversion by Combining Gesture Recognition and Facial Expression Recognition

Abstract

This paper proposes a sign-language-to-emotional-speech conversion method that integrates facial expression, aimed at easing communication between healthy people and people with speech disorders. First, sign language features and facial expression features are extracted by a deep neural network (DNN) model. Second, a support vector machine (SVM) is trained to classify the sign language and the facial expression, yielding the text of the sign language and the emotional tag of the facial expression. At the same time, a hidden Markov model (HMM)-based Mandarin-Tibetan bilingual emotional speech synthesizer is trained through speaker adaptive training on a Mandarin emotional speech corpus. Finally, Mandarin or Tibetan emotional speech is synthesized from the recognized sign language text and the emotional tags. Objective tests show a recognition rate of 90.7% for static sign language, while facial expression recognition reaches 94.6% on the extended Cohn-Kanade database (CK+) and 80.3% on the JAFFE database. Subjective evaluation shows that the synthesized emotional speech achieves an emotional mean opinion score of 4.0. The pleasure-arousal-dominance (PAD) three-dimensional emotion model is employed to evaluate the PAD values of both the facial expressions and the synthesized emotional speech. Results show that the PAD values of the facial expressions are close to those of the synthesized emotional speech, indicating that the synthesized speech can express the emotion conveyed by the facial expression.
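The pipeline described above lends itself to a compact illustration. The sketch below is a minimal, hypothetical Python example of the recognition stage (an SVM classifying DNN-extracted feature vectors into emotional tags) and of the PAD-space comparison used in the evaluation. The feature dimensions, class labels, and PAD coordinates are illustrative assumptions only and do not reproduce the paper's actual features, corpora, or models.

```python
# Minimal sketch of the DNN-feature + SVM recognition stage and the
# PAD-space comparison described in the abstract. All dimensions,
# labels, and values below are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in for DNN-extracted facial-expression features:
# 200 samples of hypothetical 128-dimensional embeddings.
features = rng.normal(size=(200, 128))
# Stand-in emotional tags (e.g., 0=neutral, 1=happy, 2=sad, 3=angry).
labels = rng.integers(0, 4, size=200)

# Train an SVM on the embeddings, mirroring the recognition step.
clf = SVC(kernel="rbf", C=1.0)
clf.fit(features, labels)

# At run time a new embedding is mapped to an emotional tag, which
# would then drive the emotional speech synthesizer.
tag = clf.predict(rng.normal(size=(1, 128)))[0]
print("predicted emotional tag:", tag)

# PAD evaluation: compare the pleasure-arousal-dominance values of the
# facial expression with those of the synthesized speech. Hypothetical
# coordinates on [-1, 1]; a small distance means the synthesized speech
# conveys roughly the same emotion as the face.
pad_face = np.array([0.55, 0.40, 0.30])
pad_speech = np.array([0.50, 0.35, 0.28])
print("PAD distance:", np.linalg.norm(pad_face - pad_speech))
```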