
A fast-match approach for robust, faster than real-time speaker diarization


Abstract

During the past few years, speaker diarization has achieved satisfactory accuracy in terms of Diarization Error Rate (DER). However, the most successful approaches, which are based on agglomerative clustering, exhibit an inherent computational complexity that makes real-time processing almost impossible, especially in combination with further processing steps. In this article we present a framework to speed up speaker diarization based on agglomerative clustering. The basic idea is to adopt a computationally cheap method to reduce the hypothesis space of the more expensive and more accurate model selection via the Bayesian Information Criterion (BIC). Two strategies, one based on the pitch correlogram and one based on the unscented-transform approximation of the KL divergence, are used independently as a fast-match approach to select the most likely clusters to merge. We performed the experiments using the existing ICSI speaker diarization system. The new system using the KL-divergence fast-match strategy performs only 14% of the total BIC comparisons needed in the baseline system and speeds up the system by 41% without affecting the DER. The result is a robust and faster than real-time speaker diarization system.
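The fast-match idea in the abstract can be illustrated with a minimal sketch: rank all cluster pairs with a cheap distance, then run the costlier BIC comparison only on a short list. This is not the ICSI system. As a loudly stated simplification, each cluster is modeled here by a single diagonal Gaussian, so the KL divergence has a closed form and no unscented-transform approximation is needed. The function names, the `shortlist_size` parameter, and the standard delta-BIC penalty are assumptions made for illustration.

```python
import itertools
import numpy as np

def fit_gaussian(x):
    """Mean and floored diagonal variance of frames x with shape (n_frames, dim)."""
    return x.mean(axis=0), x.var(axis=0) + 1e-6

def symmetric_kl(a, b):
    """Cheap fast-match score: KL(a||b) + KL(b||a) for diagonal Gaussians."""
    (mu_a, var_a), (mu_b, var_b) = fit_gaussian(a), fit_gaussian(b)
    diff2 = (mu_a - mu_b) ** 2
    return 0.5 * np.sum(
        var_a / var_b + var_b / var_a - 2.0 + diff2 * (1.0 / var_a + 1.0 / var_b)
    )

def log_likelihood(x):
    """Log-likelihood of x under the diagonal Gaussian fitted to x itself."""
    mu, var = fit_gaussian(x)
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mu) ** 2 / var)

def delta_bic(a, b, penalty_weight=1.0):
    """BIC(merged model) minus BIC(two separate models); positive favors merging."""
    merged = np.vstack([a, b])
    dim = a.shape[1]
    # Merging saves one Gaussian, i.e. 2 * dim parameters (means and variances).
    penalty = 0.5 * penalty_weight * (2 * dim) * np.log(len(merged))
    return log_likelihood(merged) - log_likelihood(a) - log_likelihood(b) + penalty

def best_merge(clusters, shortlist_size=2):
    """Return (pair, score): the best delta-BIC pair among the KL short list."""
    pairs = list(itertools.combinations(range(len(clusters)), 2))
    # Fast match: keep only the pairs that the cheap distance considers closest.
    pairs.sort(key=lambda p: symmetric_kl(clusters[p[0]], clusters[p[1]]))
    shortlist = pairs[:shortlist_size]
    # Expensive step: the BIC comparison runs on the short list only.
    scored = [(delta_bic(clusters[i], clusters[j]), (i, j)) for i, j in shortlist]
    score, pair = max(scored)
    return pair, score
```

With N clusters, one pass of exhaustive agglomerative clustering evaluates N(N-1)/2 BIC comparisons, whereas this sketch evaluates at most `shortlist_size` of them. A caller would merge the returned pair only when the score is positive and repeat until no candidate qualifies.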
