Journal: Computer Speech and Language

Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: Theory, implementation and analysis on standard tasks



Abstract

The recently proposed VBx diarization method uses a Bayesian hidden Markov model to find speaker clusters in a sequence of x-vectors. In this work, we compare VBx extensively with other approaches in the literature and show that it achieves superior performance on three of the most popular datasets for evaluating diarization: CALLHOME, AMI and DIHARD II. Further, we present for the first time the derivation and update formulae for the VBx model, focusing on the efficiency and simplicity of this model compared to the earlier, more complex BHMM model that works on frame-by-frame standard cepstral features. Together with this publication, we release the recipe for training the x-vector extractors used in our experiments on both wideband and narrowband data, and the VBx recipes that attain state-of-the-art performance on all three datasets. In addition, we point out the lack of a standardized evaluation protocol for the AMI dataset and propose a new protocol for both Beamformed and Mix-Headset audio based on the official AMI partitions and transcriptions.
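To give a feel for what clustering a sequence of embeddings with an HMM means, here is a minimal Python sketch. It is not the VBx algorithm: VBx infers the speaker models with variational Bayes, whereas this toy assumes the speaker means are already known, uses isotropic Gaussian emissions, and decodes hard labels with Viterbi. The function name `viterbi_speaker_labels` and the parameters `loop_prob` and `var` are hypothetical and chosen only for illustration.

```python
import numpy as np

def viterbi_speaker_labels(xvecs, means, loop_prob=0.9, var=1.0):
    """Label each embedding with a speaker state using a 'sticky' HMM.

    Assumptions (illustrative only, not the VBx model): speaker means are
    given, emissions are isotropic Gaussians, decoding is hard Viterbi.
    """
    xvecs = np.asarray(xvecs, dtype=float)
    means = np.asarray(means, dtype=float)
    T, S = len(xvecs), len(means)

    # Per-frame log-likelihood under each speaker (shared constants dropped).
    diff = xvecs[:, None, :] - means[None, :, :]
    loglik = -0.5 * np.sum(diff ** 2, axis=2) / var  # shape (T, S)

    # Stay with the same speaker with probability loop_prob,
    # otherwise move to one of the other speakers uniformly.
    trans = np.full((S, S), (1.0 - loop_prob) / max(S - 1, 1))
    np.fill_diagonal(trans, loop_prob)
    log_trans = np.log(trans)

    delta = np.log(1.0 / S) + loglik[0]   # best score ending in each state
    back = np.zeros((T, S), dtype=int)    # best predecessor per state
    for t in range(1, T):
        scores = delta[:, None] + log_trans          # scores[i, j]: i -> j
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(S)] + loglik[t]

    # Trace the best path backwards.
    labels = np.empty(T, dtype=int)
    labels[-1] = np.argmax(delta)
    for t in range(T - 1, 0, -1):
        labels[t - 1] = back[t, labels[t]]
    return labels

# Toy 2-D "x-vectors": the third point sits slightly closer to speaker 1.
means = np.array([[0.0, 0.0], [4.0, 0.0]])
xvecs = np.array([[0.0, 0.0], [0.1, 0.0], [2.1, 0.0], [0.0, 0.0],
                  [4.0, 0.0], [3.9, 0.0], [4.1, 0.0]])

labels = viterbi_speaker_labels(xvecs, means)
nearest = np.argmin(((xvecs[:, None, :] - means[None]) ** 2).sum(axis=2), axis=1)
print(labels.tolist())   # [0, 0, 0, 0, 1, 1, 1]
print(nearest.tolist())  # [0, 0, 1, 0, 1, 1, 1]
```

In this toy, the self-loop probability keeps the ambiguous third point with speaker 0, while a frame-by-frame nearest-mean rule flips it to speaker 1. This illustrates only the temporal-continuity aspect of an HMM over embeddings, under the assumptions stated above.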
