Conference: International Workshop on Machine Learning for Multimodal Interaction

Automatic Cluster Complexity and Quantity Selection: Towards Robust Speaker Diarization

Abstract

The goal of speaker diarization is to determine where each participant speaks in a recording. One of the most commonly used techniques is agglomerative clustering, in which a number of initial models are merged until their count matches the number of speakers present. The choice of complexity, topology, and number of initial models is vital to the final outcome of the clustering algorithm. In prior systems, these parameters were set directly from development data and were the same for all recordings. In this paper we present three techniques that select the parameters individually for each recording, yielding a system that is more robust to changes in the data. Although the choice of these values depends on tunable parameters, those tunable parameters are less sensitive to changes in the acoustic data and to how the algorithm distributes data among the different clusters. We show that by using the three techniques we achieve an improvement of up to 8% relative on the development set and 19% relative on the test set over prior systems.
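To make the agglomerative clustering step concrete, the following is a minimal sketch and not the system described in the paper. The abstract does not state the cluster models or the merge criterion, so everything here is an assumption chosen for illustration: each cluster is a single 1-D Gaussian, pairs are scored with a standard BIC difference, and the names `gaussian_loglik`, `delta_bic`, `agglomerate`, and `penalty_weight` are hypothetical. The number of initial clusters (six in the toy input) is the kind of fixed parameter the abstract says was previously set from development data.

```python
import math
from itertools import combinations

VARIANCE_FLOOR = 1e-6  # assumed floor so log() stays defined for constant data


def gaussian_loglik(values):
    """Log-likelihood of values under their own maximum-likelihood 1-D Gaussian."""
    n = len(values)
    mean = sum(values) / n
    var = max(sum((v - mean) ** 2 for v in values) / n, VARIANCE_FLOOR)
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def delta_bic(a, b, penalty_weight=1.0):
    """BIC gain from modelling a and b with one Gaussian instead of two."""
    merged = a + b
    # Merging removes one mean and one variance, i.e. two free parameters.
    penalty = 0.5 * penalty_weight * 2 * math.log(len(merged))
    return gaussian_loglik(merged) - gaussian_loglik(a) - gaussian_loglik(b) + penalty


def agglomerate(initial_clusters, penalty_weight=1.0):
    """Greedily merge the best-scoring pair until no merge improves BIC."""
    clusters = [list(c) for c in initial_clusters]
    while len(clusters) > 1:
        score, i, j = max(
            (delta_bic(clusters[i], clusters[j], penalty_weight), i, j)
            for i, j in combinations(range(len(clusters)), 2)
        )
        if score <= 0:
            break  # no remaining pair is better modelled jointly
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index i stays valid
        del clusters[j]
    return clusters


# Toy input (made up): two "speakers", each deliberately over-split into three clusters.
speaker_a = [-1.0, 0.0, 1.0] * 10
speaker_b = [9.0, 10.0, 11.0] * 10
final = agglomerate([speaker_a] * 3 + [speaker_b] * 3)
print(len(final), sorted(len(c) for c in final))
```

On this toy input the over-split clusters from the same source merge back together, while the two sources stay separate because a joint Gaussian fits them much worse than two separate ones. In this sketch `penalty_weight` plays the role of a tunable parameter: raising it makes merges more likely and lowers the final cluster count.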