Published in: International Journal of Speech Technology

Choice of a classifier, based on properties of a dataset: case study-speech emotion recognition


Abstract

This paper presents a process for selecting a classifier based on the properties of a dataset, since it is impractical to run the data through an arbitrary number of classifiers. Speech emotion recognition is taken as the case study. Different combinations of spectral and prosodic features relevant to emotion are explored, and the best subset of the chosen features is recommended for each classifier based on the properties of the chosen dataset. Various statistical tests are used to estimate the properties of the dataset, and the nature of the dataset guides the choice of a relevant classifier. To make the selection more precise, three other clustering and classification techniques (K-means clustering, vector quantization, and artificial neural networks) are also used in the experiments, and their results are compared with those of the selected classifier. The prosodic features considered are pitch, intensity, jitter, and shimmer; the spectral features are mel-frequency cepstral coefficients (MFCCs) and formants. Statistical parameters of prosody such as minimum, maximum, mean (μ), and standard deviation (σ) are extracted from the speech and combined with the basic spectral (MFCC) features to improve performance. Five basic emotions are considered: anger, fear, happiness, neutral, and sadness. To analyse the performance of different datasets on different classifiers, content- and speaker-independent emotional data collected from Telugu movies is used. The mean opinion score of fifty users is collected to label the emotional data. To generalize the conclusions, one of the benchmark IIT-Kharagpur emotional databases is also used.
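The feature combination described in the abstract (prosodic statistics joined with MFCC features) might look roughly like the sketch below. It is only an illustration under stated assumptions: the function names `prosodic_stats` and `combine_features` are hypothetical, the contours are toy values rather than data from the paper, and averaging MFCCs over frames is an assumption, since the abstract does not say how frame-level features are pooled.

```python
from statistics import mean, pstdev


def prosodic_stats(contour):
    """Return [min, max, mean, std] for one prosodic contour (e.g. pitch in Hz)."""
    return [min(contour), max(contour), mean(contour), pstdev(contour)]


def combine_features(pitch, intensity, mfcc_frames):
    """Build one utterance-level vector: prosodic statistics plus MFCC means.

    mfcc_frames is a list of frames, each a list of MFCC coefficients.
    Averaging each coefficient over frames is an assumption of this sketch.
    """
    n_coeffs = len(mfcc_frames[0])
    mfcc_means = [mean(frame[i] for frame in mfcc_frames) for i in range(n_coeffs)]
    return prosodic_stats(pitch) + prosodic_stats(intensity) + mfcc_means


# Toy values standing in for contours extracted from a single utterance.
pitch = [110.0, 120.0, 130.0, 140.0]
intensity = [60.0, 62.0, 64.0, 66.0]
mfcc_frames = [[1.0] * 13 for _ in range(4)]  # 4 frames, 13 coefficients each

features = combine_features(pitch, intensity, mfcc_frames)
print(len(features))
```

With 4 statistics per prosodic contour and 13 MFCC coefficients, the resulting vector has 21 entries; jitter, shimmer, and formants would extend it in the same way.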