Conference: Spoken Language Technology Workshop

Discriminative Neural Clustering for Speaker Diarisation

Abstract

In this paper, we propose Discriminative Neural Clustering (DNC), which formulates data clustering with a maximum number of clusters as a supervised sequence-to-sequence learning problem. Compared to traditional unsupervised clustering algorithms, DNC learns clustering patterns from training data without requiring an explicit definition of a similarity measure. An implementation of DNC based on the Transformer architecture is shown to be effective on a speaker diarisation task using the challenging AMI dataset. Since AMI contains only 147 complete meetings as individual input sequences, data scarcity is a significant issue when training a Transformer model for DNC. Accordingly, this paper proposes three data augmentation schemes: sub-sequence randomisation, input vector randomisation, and Diaconis augmentation, which generates new data samples by rotating the entire input sequence of L2-normalised speaker embeddings. Experimental results on AMI show that DNC achieves a reduction in speaker error rate (SER) of 29.4% relative to spectral clustering.
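The idea behind Diaconis augmentation, as the abstract describes it, can be illustrated with a minimal sketch. The abstract does not say how the rotation is sampled, so the Gram-Schmidt sampling below is an assumption, as are all function names (`l2_normalise`, `random_orthogonal`, `rotate_sequence`). The sampled matrix is orthogonal, which means it may combine a rotation with a reflection. The same matrix is applied to every embedding in the sequence, so vector norms and pairwise cosine similarities are unchanged and the original speaker labels should remain valid for the augmented sequence.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def l2_normalise(vectors):
    """Scale each vector to unit Euclidean length."""
    return [[x / math.sqrt(dot(v, v)) for x in v] for v in vectors]


def random_orthogonal(dim, rng):
    """Sample an orthogonal matrix (as a list of orthonormal rows).

    Assumption: Gram-Schmidt on Gaussian vectors stands in for whatever
    sampling procedure the paper actually uses.
    """
    basis = []
    while len(basis) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Remove the components along the rows already chosen.
        for b in basis:
            proj = dot(v, b)
            v = [x - proj * y for x, y in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        if norm > 1e-8:  # skip a (very unlikely) degenerate draw
            basis.append([x / norm for x in v])
    return basis


def rotate_sequence(embeddings, rng):
    """Apply one shared random orthogonal matrix to every embedding."""
    matrix = random_orthogonal(len(embeddings[0]), rng)
    return [[dot(row, emb) for row in matrix] for emb in embeddings]


# Usage: a toy "meeting" of three 4-dimensional embeddings.
rng = random.Random(0)
sequence = l2_normalise([[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)])
augmented = rotate_sequence(sequence, rng)
```

Because one matrix is shared across the whole sequence, the augmented sequence has the same geometry between segments as the original; only its orientation in the embedding space changes.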