Automatic Cluster Complexity and Quantity Selection: Towards Robust Speaker Diarization

Abstract

The goal of speaker diarization is to determine where each participant speaks in a recording. One of the most commonly used techniques is agglomerative clustering, in which a number of initial models are merged until their count matches the number of speakers present. The choice of complexity, topology, and number of initial models is vital to the final outcome of the clustering algorithm. In prior systems, these parameters were assigned directly from development data and were the same for all recordings. In this paper, we present three techniques that select the parameters individually for each recording, yielding a system that is more robust to changes in the data. Although the selection of these values itself depends on tunable parameters, those tunable parameters are less sensitive to changes in the acoustic data and to how the algorithm distributes data among the different clusters. We show that by using the three techniques, we achieve an improvement of up to 8% relative on the development set and 19% relative on the test set over prior systems.
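
The bottom-up merging idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the abstract does not specify the cluster models, merge criterion, or stopping rule, so this toy version assumes 1-D feature values, a uniform initial split, distance between cluster means as the merge criterion, and a fixed threshold for stopping. The names `agglomerative_cluster`, `n_initial`, and `stop_distance` are hypothetical.

```python
from itertools import combinations
from statistics import mean


def agglomerative_cluster(frames, n_initial, stop_distance):
    """Toy bottom-up clustering of 1-D feature values (illustrative only).

    frames: list of floats standing in for per-frame acoustic features.
    n_initial: requested number of initial clusters (assumed parameter).
    stop_distance: merging stops once the closest pair of clusters is
        farther apart than this (assumed stopping rule).
    """
    # Uniform initialization: cut the recording into contiguous chunks.
    # This yields roughly n_initial chunks (more if the length is not divisible).
    size = max(1, len(frames) // n_initial)
    clusters = [frames[i:i + size] for i in range(0, len(frames), size)]

    while len(clusters) > 1:
        # Find the pair of clusters whose means are closest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: abs(mean(clusters[p[0]]) - mean(clusters[p[1]])),
        )
        if abs(mean(clusters[i]) - mean(clusters[j])) > stop_distance:
            break  # remaining clusters are treated as distinct speakers
        # Merge cluster j into cluster i (j > i, so popping j keeps i valid).
        clusters[i] = clusters[i] + clusters.pop(j)
    return clusters


# Two alternating "speakers" with feature values near 0 and near 5.
frames = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 0.1, 0.0, 0.2, 5.1, 5.0, 5.2]
result = agglomerative_cluster(frames, n_initial=4, stop_distance=1.0)
print(len(result))
```

In this sketch, both `n_initial` and `stop_distance` are fixed by hand, which mirrors the situation the abstract describes for prior systems: values tuned once and reused for every recording.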


Bibliographic record

Source
Machine learning for multimodal interaction | 2006 | pp. 248-256 | 9 pages
Conference location: Bethesda, MD (US)
Authors
Xavier Anguera; Chuck Wooters; Javier Hernando
Affiliations

International Computer Science Institute, Berkeley CA 94704, USA; Technical University of Catalonia, Barcelona, Spain

International Computer Science Institute, Berkeley CA 94704, USA

Technical University of Catalonia, Barcelona, Spain

Original format: PDF
Language of text: English
Chinese Library Classification: programming languages, algorithmic languages

