Unsupervised Methods for Speaker Diarization: An Integrated and Iterative Approach

Shum, S.H.; Dehak, N.; Dehak, R.; Glass, J.R.

Abstract

In speaker diarization, standard approaches typically perform speaker clustering on some initial segmentation before refining the segment boundaries in a re-segmentation step to obtain a final diarization hypothesis. In this paper, we integrate an improved clustering method with an existing re-segmentation algorithm and, in iterative fashion, optimize both speaker cluster assignments and segmentation boundaries jointly. For clustering, we extend our previous research using factor analysis for speaker modeling. In continuing to take advantage of the effectiveness of factor analysis as a front-end for extracting speaker-specific features (i.e., i-vectors), we develop a probabilistic approach to speaker clustering by applying a Bayesian Gaussian Mixture Model (GMM) to principal component analysis (PCA)-processed i-vectors. We then utilize information at different temporal resolutions to arrive at an iterative optimization scheme that, in alternating between clustering and re-segmentation steps, demonstrates the ability to improve both speaker cluster assignments and segmentation boundaries in an unsupervised manner. Our proposed methods attain results that are comparable to those of a state-of-the-art benchmark set on the multi-speaker CallHome telephone corpus. We further compare our system with a Bayesian nonparametric approach to diarization and attempt to reconcile their differences in both methodology and performance.
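
The clustering step described in the abstract (a Bayesian GMM applied to PCA-processed i-vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic vectors stand in for real i-vectors, scikit-learn's `PCA` and `BayesianGaussianMixture` are assumed as convenient substitutes for the paper's own models, and the dimensions, component counts, and the `cluster_purity` helper are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-ins for segment-level i-vectors: two "speakers",
# 40 segments each, in a 20-dimensional space (all values are made up).
dim, per_speaker = 20, 40
centers = rng.normal(scale=5.0, size=(2, dim))
ivectors = np.vstack([c + rng.normal(size=(per_speaker, dim)) for c in centers])
true_speaker = np.repeat([0, 1], per_speaker)

# Project to a low-dimensional space with PCA before clustering.
projected = PCA(n_components=3, random_state=0).fit_transform(ivectors)

# Variational Bayesian GMM with more components than expected speakers;
# components that the data do not support receive small weights.
bgmm = BayesianGaussianMixture(
    n_components=5,
    covariance_type="full",
    weight_concentration_prior_type="dirichlet_process",
    max_iter=500,
    random_state=0,
)
labels = bgmm.fit_predict(projected)


def cluster_purity(labels, truth):
    """Fraction of segments that belong to their cluster's majority speaker."""
    correct = 0
    for k in np.unique(labels):
        members = truth[labels == k]
        correct += np.bincount(members).max()
    return correct / len(truth)


purity = cluster_purity(labels, true_speaker)
print("clusters used:", len(np.unique(labels)), "purity:", round(purity, 3))
```

In a full diarization system, the resulting cluster labels would feed a re-segmentation step and the two steps would alternate, as the abstract describes. That loop is omitted here because the abstract does not specify its details.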

Bibliographic details

Source
IEEE Transactions on Audio, Speech, and Language Processing | 2013, No. 10 | pp. 2015-2028 | 14 pages
Authors
Shum, S.H.; Dehak, N.; Dehak, R.; Glass, J.R.
Author affiliation

MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA

Original format: PDF
Language: English
Keywords
Bayesian nonparametric inference; HDP-HMM; factor analysis; i-vectors; principal component analysis; speaker clustering; speaker diarization; spectral clustering; variational Bayes;

