Spoken Language Technology Workshop

Multi-Class Spectral Clustering with Overlaps for Speaker Diarization


Abstract

This paper describes a method for overlap-aware speaker diarization. Given an overlap detector and a speaker embedding extractor, our method performs spectral clustering of segments informed by the output of the overlap detector. This is achieved by transforming the discrete clustering problem into a convex optimization problem, which is solved by eigen-decomposition. We then discretize the solution by alternating between singular value decomposition and a modified version of non-maximal suppression that is constrained by the output of the overlap detector. Furthermore, we detail an HMM-DNN based overlap detector that performs frame-level classification and enforces duration constraints through HMM state transitions. Our method achieves a test diarization error rate (DER) of 24.0% on the mixed-headset setting of the AMI meeting corpus, a relative improvement of 15.2% over a strong agglomerative hierarchical clustering baseline, and compares favorably with other overlap-aware diarization methods. Further analysis on the LibriCSS data demonstrates the effectiveness of the proposed method in high-overlap conditions.
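
The pipeline summarized in the abstract (relax, solve by eigen-decomposition, then discretize by alternating non-maximal suppression and SVD) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name, the precomputed affinity matrix, the boolean overlap mask, the initialization of the rotation, the fixed iteration count, and the rule "an overlapping segment receives its two highest-scoring speakers" are all assumptions made for illustration.

```python
import numpy as np


def overlap_aware_spectral_clustering(affinity, num_speakers, overlap_mask, num_iters=20):
    """Hypothetical sketch. Returns a binary (N, K) assignment matrix in which
    rows flagged by overlap_mask get two speakers and all other rows get one.
    Assumes num_speakers >= 2 and strictly positive row sums in `affinity`."""
    n = affinity.shape[0]

    # Relaxed problem: top-K eigenvectors of the symmetrically normalized affinity.
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    normalized = d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(normalized)  # eigenvalues in ascending order
    z = d_inv_sqrt[:, None] * eigvecs[:, -num_speakers:]
    x_tilde = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-length rows

    # Assumed initialization: build the rotation from K mutually dissimilar rows.
    rotation = np.zeros((num_speakers, num_speakers))
    rotation[:, 0] = x_tilde[0]
    similarity = np.zeros(n)
    for k in range(1, num_speakers):
        similarity += np.abs(x_tilde @ rotation[:, k - 1])
        rotation[:, k] = x_tilde[np.argmin(similarity)]

    rows = np.arange(n)
    for _ in range(num_iters):
        # Non-maximal suppression: keep the best speaker per segment, plus the
        # second best where the overlap detector flagged the segment.
        scores = x_tilde @ rotation
        order = np.argsort(-scores, axis=1)
        x = np.zeros_like(scores)
        x[rows, order[:, 0]] = 1.0
        x[overlap_mask, order[overlap_mask, 1]] = 1.0

        # SVD step: orthogonal rotation that best aligns x_tilde with x.
        u, _, vt = np.linalg.svd(x.T @ x_tilde)
        rotation = vt.T @ u.T
    return x


# Toy input: segments 0-1 from one speaker, 2-3 from another, 4 overlapping.
toy_affinity = np.array([
    [1.0, 0.9, 0.1, 0.1, 0.5],
    [0.9, 1.0, 0.1, 0.1, 0.5],
    [0.1, 0.1, 1.0, 0.9, 0.5],
    [0.1, 0.1, 0.9, 1.0, 0.5],
    [0.5, 0.5, 0.5, 0.5, 1.0],
])
toy_overlap = np.array([False, False, False, False, True])
labels = overlap_aware_spectral_clustering(toy_affinity, 2, toy_overlap)
```

In this toy case the overlapping segment is equally similar to both groups, so it is assigned to both speakers while every other segment keeps a single label. Which column corresponds to which speaker is arbitrary, since eigenvector signs are not unique.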
