Cross Likelihood Ratio Based Speaker Clustering Using Eigenvoice Models

Abstract

This paper proposes using eigenvoice modeling techniques with the Cross Likelihood Ratio (CLR) as the criterion for speaker clustering within a speaker diarization system. The CLR has previously been shown to be a robust decision criterion for speaker clustering using Gaussian Mixture Models (GMMs). Eigenvoice modeling techniques have recently become increasingly popular because they can adequately represent a speaker from sparse training data and better capture differences in speaker characteristics. This paper therefore proposes capitalizing on the advantages of eigenvoice modeling within a CLR framework. Results on the 2002 Rich Transcription (RT-02) Evaluation dataset show improved clustering performance, yielding a 35.1% relative improvement in overall Diarization Error Rate (DER) over the baseline system.
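As an illustration of the CLR criterion mentioned in the abstract, the following is a minimal sketch of the commonly used GMM-based form: for two clusters, each cluster's frames are scored against the other cluster's model, normalised by a background model (UBM) and by the number of frames. It is not the eigenvoice variant the paper proposes, whose details are not given in this record. The function names, the `(weights, means, variances)` model tuples, and the diagonal-covariance assumption are all hypothetical choices made for the example.

```python
import numpy as np

def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (N, D); weights: (K,); means, variances: (K, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                  # (N, K, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )                                                              # (N, K)
    log_weighted = log_comp + np.log(weights)
    # Numerically stable log-sum-exp over the K mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(frame_ll.mean())

def cross_likelihood_ratio(x_i, x_j, model_i, model_j, ubm):
    """CLR between two clusters (assumed form, frame-normalised).

    Each term is the average log-likelihood ratio of one cluster's frames
    under the other cluster's model versus the background model.
    """
    term_i = gmm_avg_loglik(x_i, *model_j) - gmm_avg_loglik(x_i, *ubm)
    term_j = gmm_avg_loglik(x_j, *model_i) - gmm_avg_loglik(x_j, *ubm)
    return term_i + term_j
```

In an agglomerative clustering loop, the pair of clusters with the highest CLR would typically be merged first, with merging stopped once the best score falls below a tuned threshold. The threshold and the way the cluster models are adapted are system-specific and are not described in this record.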

Bibliographic details

Source
Annual Conference of the International Speech Communication Association (INTERSPEECH 2011), 2011, pp. 964-967 (4 pages)
Authors
D. Wang; R. Vogt; S. Sridharan; D. Dean
Original format
PDF
CLC classification
Communications
Keywords
eigenvoice modeling; joint factor analysis; cross likelihood ratio; speaker clustering; speaker diarization

Similar literature

1. Eigenvoice modelling for cross likelihood ratio based speaker clustering: A Bayesian approach [J]. David Wang, Robert Vogt, Sridha Sridharan. Computer Speech and Language, 2013, no. 4.
2. Enhancements of Maximum Likelihood Eigen-Decomposition Using Fuzzy Logic Control for Eigenvoice-Based Speaker Adaptation [J]. Ing-Jr. Ding. International Journal of Innovative Computing Information and Control, 2011, no. 7B.
3. Using Eigenvoices and Nearest-Neighbors in HMM-Based Cross-Lingual Speaker Adaptation With Limited Data [J]. Seyyed Saeed Sarfjoo, Cenk Demiroğlu, Simon King. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017, no. 4.
4. Clustering Speech Utterances by Speaker Using Eigenvoice-Motivated Vector Space Models [C]. Wei-Ho Tsai, Shih-Sian Cheng, Yi-Hsiang Chao. IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
5. Generalized fixed effect models and likelihood based clustering in codon substitution model [D]. Bao, Le. 2005.
6. A likelihood-based approach to mixed modeling with ambiguity in cluster identifiers [O]. Andrea S. Foulkes, Recai Yucel, Xiaohong Li.
7. Eigenvoice modeling for cross likelihood ratio based speaker clustering: a Bayesian approach [O]. Wang David, Vogt Robert J., Sridharan Sridha. 2013.
