Emotional Speaker Recognition based on Machine and Deep Learning

Abstract

Speaker recognition is a method that identifies a speaker from the characteristics of their voice. Speaker recognition technologies are widely used in many domains. Most speaker recognition systems are trained on normal, clean recordings; however, their performance tends to degrade when recognising emotional speech. This paper presents an emotional speaker recognition system trained with machine learning and deep learning algorithms on time, frequency and spectral features extracted from an emotional speech database, the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). We trained and compared the performance of five machine learning models (Logistic Regression, Support Vector Machine, Random Forest, XGBoost, and k-Nearest Neighbor) and three deep learning models (Long Short-Term Memory network, Multilayer Perceptron, and Convolutional Neural Network). In the evaluation, the deep neural networks performed better than the machine learning models, attaining the highest accuracy of 92% and outperforming state-of-the-art models in emotional speaker detection from speech signals.
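The pipeline described in the abstract (extract time and spectral features, then train a classifier to name the speaker) can be illustrated with a toy sketch. This is not the authors' implementation: the synthetic sine-tone "voices", the sample rate, the three features chosen, and every function name below are assumptions made for illustration, and only the k-Nearest Neighbor model is shown, written with the Python standard library rather than the tools the paper may have used.

```python
import cmath
import math

SAMPLE_RATE = 4000   # Hz; assumed toy value, not taken from the paper
NUM_SAMPLES = 200    # gives 20 Hz DFT bins, so the toy pitches fall on exact bins


def synth_voice(pitch_hz, loudness):
    """Hypothetical stand-in for a recording: a single sine tone."""
    return [
        loudness * math.sin(2 * math.pi * pitch_hz * t / SAMPLE_RATE)
        for t in range(NUM_SAMPLES)
    ]


def zero_crossing_rate(signal):
    """Time-domain feature: fraction of adjacent sample pairs that change sign."""
    crossings = sum((a >= 0) != (b >= 0) for a, b in zip(signal, signal[1:]))
    return crossings / (len(signal) - 1)


def rms_energy(signal):
    """Time-domain feature: root-mean-square amplitude."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def spectral_centroid(signal):
    """Spectral feature: magnitude-weighted mean frequency in Hz (naive O(n^2) DFT)."""
    n = len(signal)
    weighted = total = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        weighted += (k * SAMPLE_RATE / n) * abs(coeff)
        total += abs(coeff)
    return weighted / total if total else 0.0


def extract_features(signal):
    """Feature vector; the centroid is scaled by the Nyquist frequency to lie in [0, 1]."""
    nyquist = SAMPLE_RATE / 2
    return [
        zero_crossing_rate(signal),
        rms_energy(signal),
        spectral_centroid(signal) / nyquist,
    ]


def knn_predict(train, query, k=1):
    """Majority label among the k training vectors closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)


# Train on "neutral" clips only (moderate loudness).
train = [
    (extract_features(synth_voice(pitch, 0.5)), speaker)
    for speaker, pitch in [("A", 100), ("A", 120), ("B", 220), ("B", 240)]
]

# Test on "emotional" clips: louder and 20 Hz higher in pitch than any training clip.
emotional = {"A": synth_voice(140, 0.9), "B": synth_voice(260, 0.9)}
predictions = {
    speaker: knn_predict(train, extract_features(clip))
    for speaker, clip in emotional.items()
}
print(predictions)
```

In this sketch the emotional clips shift every feature away from the neutral training data, which is the mismatch the abstract points to. The toy speakers are far enough apart in pitch that the nearest neighbor still names the right one, but real speech offers no such guarantee, which is why the paper trains on emotional recordings.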

Bibliographic details

Source
International Multidisciplinary Information Technology and Engineering Conference | 2020 | pp. 1-8 | 8 pages
Authors
Tshephisho Joseph Sefara; Tumisho Billson Mokgonyane
Original format: PDF
Keywords
Deep learning; Support vector machines; Emotion recognition; Machine learning algorithms; Databases; Speech recognition; Speaker recognition

