A study of speaker clustering for speaker attribution in large telephone conversation datasets

Abstract

This paper proposes the task of speaker attribution as speaker diarization followed by speaker linking. The aim of attribution is to identify and label common speakers across multiple recordings. To do this, diarization must first be carried out to obtain speaker-homogeneous segments from each recording. Speaker linking can then be conducted to link common speaker identities across multiple inter-session recordings. This process can be extremely inefficient with the traditional agglomerative cluster merging and retraining commonly employed in diarization. We thus propose an attribution system using complete-linkage clustering (CLC) without model retraining. We show that, on top of the efficiency gained by eliminating the retraining phase, greater accuracy is achieved by using the farthest-neighbor criterion inherent to CLC for both diarization and linking. We first evaluate CLC for speaker linking against agglomerative clustering without retraining (AC), traditional agglomerative clustering with retraining (ACR) and single-linkage clustering (SLC). We show that CLC provides relative improvements of 20%, 29% and 39% in attribution error rate (AER) over these three approaches, respectively. We then propose a diarization system using CLC and show that it outperforms AC, ACR and SLC with relative improvements of 32%, 50% and 70% in diarization error rate (DER), respectively. In our work, we employ the cross-likelihood ratio (CLR) as the model comparison metric for clustering and investigate its robustness as a stopping criterion for attribution.
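The farthest-neighbor criterion behind complete-linkage clustering can be illustrated with a minimal sketch. This is not the authors' system: the function name `complete_linkage`, the toy distance matrices and the threshold value are all hypothetical, and the sketch assumes pairwise model-comparison scores (such as CLR) have already been converted into dissimilarities where smaller means more alike. It only shows that clusters are merged using the original pairwise scores, with no retraining, and that merging stops once the worst-case pair in the best candidate merge exceeds a threshold.

```python
from itertools import combinations


def complete_linkage(dist, threshold):
    """Cluster items 0..n-1 from a symmetric pairwise distance matrix.

    Repeatedly merges the two clusters whose farthest-member distance is
    smallest, and stops once that distance exceeds `threshold`.
    Only the original pairwise distances are used (no model retraining).
    NOTE: illustrative sketch; names and threshold are assumptions.
    """
    clusters = [[i] for i in range(len(dist))]

    def farthest(a, b):
        # Complete linkage: cluster distance is the worst-case member pair.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest farthest-member distance.
        p, q = min(
            combinations(range(len(clusters)), 2),
            key=lambda pq: farthest(clusters[pq[0]], clusters[pq[1]]),
        )
        if farthest(clusters[p], clusters[q]) > threshold:
            break  # stopping criterion: best remaining merge is too dissimilar
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # p < q, so index p is unaffected
    return [sorted(c) for c in clusters]


# Toy example: segments 0 and 1 are close, 1 and 2 are close, 0 and 2 are not.
chain = [
    [0.0, 0.3, 0.9],
    [0.3, 0.0, 0.3],
    [0.9, 0.3, 0.0],
]
print(complete_linkage(chain, threshold=0.5))
```

In the toy matrix, a nearest-neighbor (single-linkage) rule would chain all three segments together through segment 1, whereas the farthest-neighbor rule declines the second merge because segments 0 and 2 are too dissimilar.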
