Annual Conference of the International Speech Communication Association; INTERSPEECH 2011

Cross Likelihood Ratio Based Speaker Clustering Using Eigenvoice Models

Abstract

This paper proposes the use of eigenvoice modeling techniques with the Cross Likelihood Ratio (CLR) as a criterion for speaker clustering within a speaker diarization system. The CLR has previously been shown to be a robust decision criterion for speaker clustering using Gaussian Mixture Models. Recently, eigenvoice modeling techniques have become increasingly popular, owing to their ability to adequately represent a speaker based on sparse training data, as well as their improved capture of differences in speaker characteristics. This paper therefore proposes capitalizing on the advantages of eigenvoice modeling in a CLR framework. Results obtained on the 2002 Rich Transcription (RT-02) Evaluation dataset show improved clustering performance, resulting in a 35.1% relative improvement in the overall Diarization Error Rate (DER) compared to the baseline system.
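
The abstract does not state the CLR formula, so the following is only a minimal sketch of the form commonly used in the diarization literature: for two clusters i and j, the average per-frame log-likelihood of each cluster's frames under the other cluster's model, normalized by a background model (UBM), summed over both directions. The function names, the `(weight, mean, variance)` tuple layout, and the use of plain diagonal-covariance GMMs are assumptions made for illustration; the eigenvoice adaptation that the paper uses to obtain the speaker models is not implemented here.

```python
import math


def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: list of feature vectors (lists of floats).
    gmm: list of (weight, mean, variance) tuples, one per mixture component
         (hypothetical layout chosen for this sketch).
    """
    total = 0.0
    for x in frames:
        comp_logs = []
        for weight, mean, var in gmm:
            log_p = math.log(weight)
            for xd, md, vd in zip(x, mean, var):
                log_p -= 0.5 * (math.log(2 * math.pi * vd) + (xd - md) ** 2 / vd)
            comp_logs.append(log_p)
        # log-sum-exp over components for numerical stability
        peak = max(comp_logs)
        total += peak + math.log(sum(math.exp(c - peak) for c in comp_logs))
    return total / len(frames)


def cross_likelihood_ratio(frames_i, frames_j, model_i, model_j, ubm):
    """Symmetric CLR between two clusters, in the log domain.

    Each cluster's frames are scored against the OTHER cluster's model and
    normalized by the background model; the two directions are summed.
    """
    return (
        avg_log_likelihood(frames_i, model_j) - avg_log_likelihood(frames_i, ubm)
        + avg_log_likelihood(frames_j, model_i) - avg_log_likelihood(frames_j, ubm)
    )
```

Under this definition a larger value suggests the two clusters are more likely to belong to the same speaker, so an agglomerative clustering loop would typically merge the pair with the highest CLR and stop once no pair exceeds a tuned threshold. In the proposed system the models passed as `model_i` and `model_j` would come from eigenvoice adaptation rather than the toy GMMs assumed here.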