A Study of Cross-Linguistic Speech Emotion Recognition Based on 2D Feature Spaces

Abstract

This study investigates cross-linguistic speech emotion recognition. Emotional speech data in six languages (English, Lithuanian, German, Spanish, Serbian, and Polish) were collected, yielding a cross-linguistic speech emotion dataset of more than 10,000 emotional utterances. Although the collected databases are bimodal, we focus on the acoustic representation only, on the assumption that the speech audio signal carries enough emotional information for emotion to be detected and retrieved from it. Several two-dimensional acoustic feature spaces, namely cochleagrams, spectrograms, mel-cepstrograms, and a fractal dimension-based space, are employed as representations of emotional speech features. A convolutional neural network (CNN) is used as the classifier. The results show that cochleagrams outperform the other feature spaces considered. In the CNN-based, speaker-independent, cross-linguistic speech emotion recognition (SER) experiment, an accuracy of over 90% is achieved, which is close to that of monolingual SER.
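As an illustration of what a two-dimensional acoustic feature space looks like in practice, the following is a minimal sketch of a log-magnitude spectrogram computed with NumPy only. It is not the authors' pipeline: the abstract gives no analysis parameters, so the function name `log_spectrogram`, the 16 kHz sample rate, the 400-sample frame, and the 160-sample hop are all assumptions chosen for the example, and the synthetic 1 kHz tone stands in for a real utterance. Cochleagrams and mel-cepstrograms would require additional filterbank steps that are not shown here.

```python
import numpy as np


def log_spectrogram(signal, frame_len=400, hop=160, eps=1e-10):
    """Return a (n_freq_bins, n_frames) log-magnitude spectrogram in dB.

    frame_len and hop are illustrative defaults (25 ms / 10 ms at 16 kHz),
    not values taken from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    # Zero-pad very short inputs so that at least one frame exists.
    if signal.size < frame_len:
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # One-sided FFT magnitude of each frame, converted to decibels.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return 20.0 * np.log10(magnitude + eps).T


# Synthetic stand-in for an utterance: one second of a 1 kHz tone.
sample_rate = 16000  # assumed sample rate
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = log_spectrogram(tone)
print(spec.shape)  # (frequency bins, time frames)
```

The resulting array has frequency along one axis and time along the other, so it can be treated like a single-channel image, which is the kind of input a CNN classifier typically consumes.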
