Conference: International Multidisciplinary Information Technology and Engineering Conference

Emotional Speaker Recognition based on Machine and Deep Learning


Abstract

Speaker recognition identifies a speaker from the characteristics of their voice, and speaker recognition technologies are widely used in many domains. Most speaker recognition systems are trained on normal, clean recordings; however, their performance tends to degrade when recognising emotional speech. This paper presents an emotional speaker recognition system trained with machine learning and deep learning algorithms on time, frequency and spectral features extracted from emotional speech in the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). We trained and compared five machine learning models (Logistic Regression, Support Vector Machine, Random Forest, XGBoost, and k-Nearest Neighbor) and three deep learning models (Long Short-Term Memory network, Multilayer Perceptron, and Convolutional Neural Network). In the evaluation, the deep neural networks performed better than the machine learning models, attaining the highest accuracy of 92% and outperforming state-of-the-art models in emotional speaker detection from speech signals.
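The abstract does not say how the recordings were labelled, so the following is only a minimal sketch of one plausible first step, not the authors' pipeline. It relies on the published RAVDESS file-naming convention (seven hyphen-separated fields ending in the actor number) to recover a speaker label and an emotion tag for each file. In speaker recognition the actor number would be the target, while emotion is the source of variability the system has to tolerate. The names `Utterance` and `parse_ravdess_name` are hypothetical helpers introduced for illustration.

```python
import os
from typing import NamedTuple

# Emotion codes used in RAVDESS file names (third field).
EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}


class Utterance(NamedTuple):
    speaker_id: int  # actor number (1-24): the label a speaker recogniser predicts
    emotion: str     # emotion name: a nuisance factor here, not the target


def parse_ravdess_name(path: str) -> Utterance:
    """Parse a RAVDESS file name such as '03-01-06-01-02-01-12.wav'.

    The seven fields are: modality, vocal channel, emotion, intensity,
    statement, repetition, actor.
    """
    stem, _ext = os.path.splitext(os.path.basename(path))
    fields = stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"unexpected RAVDESS file name: {path!r}")
    return Utterance(speaker_id=int(fields[6]), emotion=EMOTIONS[fields[2]])


if __name__ == "__main__":
    # Actor 12 speaking with a fearful expression.
    print(parse_ravdess_name("03-01-06-01-02-01-12.wav"))
```

With labels in this form, one could group utterances by `speaker_id` and check that every emotion appears in both the training and test splits, although whether the paper split its data this way is not stated.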
