Journal: IEEE Transactions on Audio, Speech and Language Processing

Discrimination Power of Vocal Source and Vocal Tract Related Features for Speaker Segmentation


Abstract

This paper presents an analysis of the speaker discrimination power of vocal source related features, in comparison with the conventional vocal tract related features. The vocal source features, named wavelet octave coefficients of residues (WOCOR), are extracted by a pitch-synchronous wavelet transform of the linear predictive (LP) residual signal. A series of controlled experiments shows that WOCOR is less sensitive to spoken content than the conventional MFCC features and is thus more discriminative when the amount of training data is limited. These advantages of WOCOR are exploited in the task of speaker segmentation for telephone conversations, in which statistical speaker models need to be built upon short speech segments. Experimental results show that the proposed use of WOCOR leads to a noticeable reduction of segmentation errors.
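The extraction pipeline named in the abstract (LP residual, then a pitch-synchronous wavelet transform summarized per octave) can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not give the wavelet family, the LP order, the window length, or the number of octaves, so everything below is an assumption. In particular, the Haar wavelet is a stand-in chosen for brevity, a single LP filter is fitted to the whole excerpt, the analysis window is taken to span two pitch periods, and the function names `lp_residual`, `haar_octave_norms`, and `wocor_like` are hypothetical.

```python
import numpy as np


def lp_residual(signal, order=12):
    """Inverse-filter a signal with its own LP coefficients (autocorrelation method)."""
    x = np.asarray(signal, dtype=float)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:]; a tiny ridge keeps R invertible
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += (1e-9 * r[0] + 1e-12) * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Residual e[n] = x[n] - sum_k a[k] * x[n-k], with zero initial conditions
    inverse_filter = np.concatenate(([1.0], -a))
    return np.convolve(x, inverse_filter)[: len(x)]


def haar_octave_norms(segment, levels=4):
    """L2 norm of the Haar detail coefficients at each octave level (assumed summary)."""
    approx = np.asarray(segment, dtype=float)
    norms = []
    for _ in range(levels):
        if len(approx) < 2:
            norms.append(0.0)  # segment too short for this octave
            continue
        if len(approx) % 2:
            approx = approx[:-1]  # drop one sample to get an even length
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2.0)
        approx = (even + odd) / np.sqrt(2.0)
        norms.append(float(np.linalg.norm(detail)))
    return np.array(norms)


def wocor_like(signal, pitch_marks, order=12, levels=4):
    """One feature vector per interior pitch mark, from a two-period residual window."""
    residual = lp_residual(signal, order)
    feats = [
        haar_octave_norms(residual[start:stop], levels)
        for start, stop in zip(pitch_marks[:-2], pitch_marks[2:])
    ]
    return np.array(feats)
```

The pitch marks (sample indices of successive glottal cycles) are expected from a separate pitch tracker, which is not shown. Calling `wocor_like(signal, pitch_marks)` returns an array with one row per interior pitch mark and one column per octave level.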
