Discriminative Neural Clustering for Speaker Diarisation

Abstract

In this paper, we propose Discriminative Neural Clustering (DNC), which formulates data clustering with a maximum number of clusters as a supervised sequence-to-sequence learning problem. Compared to traditional unsupervised clustering algorithms, DNC learns clustering patterns from training data without requiring an explicit definition of a similarity measure. An implementation of DNC based on the Transformer architecture is shown to be effective on a speaker diarisation task using the challenging AMI dataset. Since AMI contains only 147 complete meetings as individual input sequences, data scarcity is a significant issue for training a Transformer model for DNC. Accordingly, this paper proposes three data augmentation schemes: sub-sequence randomisation, input vector randomisation, and Diaconis augmentation, which generates new data samples by rotating the entire input sequence of L2-normalised speaker embeddings. Experimental results on AMI show that DNC achieves a reduction in speaker error rate (SER) of 29.4% relative to spectral clustering.
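
The rotation idea behind Diaconis augmentation can be illustrated with a small sketch. The abstract does not say how the rotation is sampled, so the code below makes an assumption: it composes random Givens (plane) rotations, which always gives a proper rotation but is not necessarily the sampling procedure used in the paper. The function names `random_rotation`, `l2_normalise`, `rotate_sequence`, and `dot`, and the default number of plane rotations, are hypothetical and chosen only for illustration. The point it shows is that applying one rotation to every embedding in a sequence keeps each vector on the unit sphere and leaves all pairwise inner products unchanged, so the cluster structure of the sequence is preserved while the input values change.

```python
import math
import random


def random_rotation(dim, n_planes=None, rng=random):
    """Build a dim x dim proper rotation from random Givens rotations.

    Assumption: this is an illustrative sampler, not the paper's procedure.
    """
    if n_planes is None:
        n_planes = dim * (dim - 1) // 2  # hypothetical default
    rot = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for _ in range(n_planes):
        i, j = rng.sample(range(dim), 2)  # pick a random coordinate plane
        theta = rng.uniform(0.0, 2.0 * math.pi)
        c, s = math.cos(theta), math.sin(theta)
        # Left-multiply by the Givens rotation: only rows i and j are mixed.
        row_i, row_j = rot[i], rot[j]
        rot[i] = [c * a - s * b for a, b in zip(row_i, row_j)]
        rot[j] = [s * a + c * b for a, b in zip(row_i, row_j)]
    return rot


def l2_normalise(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def rotate_sequence(embeddings, rot):
    """Apply the same rotation matrix to every embedding in the sequence."""
    return [[sum(r * x for r, x in zip(row, vec)) for row in rot]
            for vec in embeddings]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Toy sequence of three 4-dimensional, L2-normalised "speaker embeddings".
rng = random.Random(0)
seq = [l2_normalise([rng.gauss(0.0, 1.0) for _ in range(4)]) for _ in range(3)]
augmented = rotate_sequence(seq, random_rotation(4, rng=rng))
```

Because cosine similarities between segments are identical before and after the rotation, the target cluster labels for `augmented` would be the same as for `seq`; only the input coordinates differ.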

Bibliographic details

Source
Spoken Language Technology Workshop | 2021 | pp. 574-581 | 8 pages
Authors
Qiujia Li; Florian L. Kreyssig; Chao Zhang; Philip C. Woodland
Original format: PDF
Keywords
Training; Error analysis; Neural networks; Clustering algorithms; Training data; Data models; Task analysis
Record added: 2022-08-26 13:52:51

Similar literature

1. Speaker overlap detection with prosodic features for speaker diarisation [J]. Zelenak M., Hernando J. Signal Processing, IET. 2012, No. 8
2. Discriminative Learning of Filterbank Layer within Deep Neural Network Based Speech Recognition for Speaker Adaptation [J]. Hiroshi SEKI, Kazumasa YAMAMOTO, Tomoyosi AKIBA. IEICE Transactions on Information and Systems. 2019, No. 2
3. Discriminative Neural Embedding Learning for Short-Duration Text-Independent Speaker Verification [J]. Wang Shuai, Huang Zili, Qian Yanmin. Audio, Speech, and Language Processing, IEEE/ACM Transactions on. 2019, No. 11
4. DNN-based speaker clustering for speaker diarisation [C]. Rosanna Milner, Thomas Hain. Annual Conference of the International Speech Communication Association. 2016
5. Discriminative and generative approaches for long- and short-term speaker characteristics modeling: Application to speaker verification [D]. Dehak, Najim. 2009
6. Neural-Colony Forming Cell Assay: An Assay To Discriminate Bona Fide Neural Stem Cells from Neural Progenitor Cells [O]. Hassan Azari, Sharon A. Louis, Sharareh Sharififar. 2011
7. Discriminative Neural Clustering for Speaker Diarisation [O]. Qiujia Li, Florian L. Kreyssig, Chao Zhang. 2021
