Journal: Computer Speech and Language

Speaker state classification based on fusion of asymmetric simple partial least squares (SIMPLS) and support vector machines

Abstract

This paper studies how acoustic features, speaker normalization methods, and statistical modeling techniques affect speaker state classification. We focus on the effect of simple partial least squares (SIMPLS) in unbalanced binary classification. Beyond dimension reduction and low computational complexity, the SIMPLS classifier (SIMPLSC) shows notably higher prediction accuracy for the class with fewer samples. We therefore propose an asymmetric SIMPLS classifier (ASIMPLSC) to improve the performance of SIMPLSC on the class with more samples. Furthermore, we combine the outputs of multiple systems (the ASIMPLS classifier and support vector machines) by score-level fusion to exploit the complementary information in diverse systems. The proposed speaker state classification system is evaluated in several experiments on unbalanced data sets. In the Interspeech 2011 Speaker State Challenge, we achieved the best results for the 2-class task of the Sleepiness Sub-Challenge, with an unweighted average recall of 71.7%. Further experiments on the SEMAINE data sets show that, compared with the AVEC 2011 baseline system, the ASIMPLSC achieves absolute improvements in weighted average recall of 6.1%, 6.1%, 24.5%, and 1.3% on the emotional speech binary classification tasks for the four dimensions of activation, expectation, power, and valence, respectively.
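
The abstract reports results as unweighted and weighted average recall and combines two systems by score-level fusion. The sketch below illustrates those generic ideas in plain Python on a toy unbalanced label set. It is not the authors' implementation: the function names, the toy data, and the equal-weight convex combination in `fuse_scores` are assumptions made for illustration, and the abstract does not state how the fusion weights or the asymmetric decision rule were chosen.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def unweighted_average_recall(y_true, y_pred):
    """UAR: plain mean of per-class recalls, so every class counts equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def weighted_average_recall(y_true, y_pred):
    """WAR: per-class recalls weighted by class size (equals accuracy)."""
    totals = Counter(y_true)
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls[c] * totals[c] for c in totals) / len(y_true)


def fuse_scores(scores_a, scores_b, weight_a=0.5):
    """Score-level fusion as a convex combination of two systems' scores.

    Assumes both score lists are already on comparable scales
    (hypothetical setup, not taken from the paper).
    """
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(scores_a, scores_b)]


# Toy unbalanced set: 8 samples of class 0, 2 samples of class 1.
y_true = [0] * 8 + [1] * 2
always_majority = [0] * 10  # a classifier that ignores the minority class

print(unweighted_average_recall(y_true, always_majority))  # 0.5
print(weighted_average_recall(y_true, always_majority))    # 0.8

# Hypothetical scores from two systems; a positive fused score means class 1.
fused = fuse_scores([-0.4, 0.6], [-0.2, -0.3])
print([int(s > 0) for s in fused])  # [0, 1]
```

On this toy set, a classifier that always predicts the majority class still reaches a weighted average recall of 0.8 but only 0.5 unweighted, which is why unweighted average recall is the more informative figure when classes are unbalanced.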