IEEE Workshop on Automatic Speech Recognition and Understanding

A FAST-MATCH APPROACH FOR ROBUST, FASTER THAN REAL-TIME SPEAKER DIARIZATION

Abstract

During the past few years, speaker diarization has achieved satisfying accuracy in terms of speaker Diarization Error Rate (DER). The most successful approaches, based on agglomerative clustering, however, exhibit an inherent computational complexity that makes real-time processing almost impossible, especially in combination with further processing steps. In this article we present a framework to speed up agglomerative-clustering speaker diarization. The basic idea is to adopt a computationally cheap method to reduce the hypothesis space of the more expensive and accurate model selection via the Bayesian Information Criterion (BIC). Two strategies, one based on the pitch correlogram and one on the unscented-transform-based approximation of KL divergence, are used independently as a fast-match approach to select the most likely clusters to merge. We performed the experiments using the existing ICSI speaker diarization system. The new system using the KL-divergence fast-match strategy performs only 14% of the total BIC comparisons needed in the baseline system and speeds up the system by 41% without affecting the DER. The result is a robust, faster-than-real-time speaker diarization system.
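The fast-match idea in the abstract can be illustrated with a small sketch: rank all cluster pairs with a cheap distance, then run the expensive BIC comparison only on a short list of the closest pairs. This is an illustration under simplifying assumptions, not the ICSI system's implementation. Each cluster is modeled here as a single diagonal-covariance Gaussian, so the KL divergence has a closed form; that closed form stands in for the unscented-transform approximation the paper uses. The function names, the `shortlist_size` and `penalty_weight` parameters, and the particular ΔBIC penalty form are all hypothetical choices made for this example.

```python
import math
from itertools import combinations

def fit_diag_gaussian(frames):
    """Maximum-likelihood mean and variance per dimension (variance floored)."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var

def symmetric_kl(g1, g2):
    """Cheap score: closed-form symmetric KL between two diagonal Gaussians.
    Assumption: stands in for the paper's unscented-transform approximation."""
    (m1, v1), (m2, v2) = g1, g2
    total = 0.0
    for a, va, b, vb in zip(m1, v1, m2, v2):
        total += 0.5 * (va / vb + vb / va - 2.0)
        total += 0.5 * (a - b) ** 2 * (1.0 / va + 1.0 / vb)
    return total

def log_likelihood(frames):
    """Log-likelihood of frames under their own ML diagonal Gaussian
    (exact when no variance was floored)."""
    _, var = fit_diag_gaussian(frames)
    return -0.5 * len(frames) * sum(math.log(2 * math.pi * v) + 1.0 for v in var)

def delta_bic(frames_a, frames_b, penalty_weight=1.0):
    """Expensive score: BIC(merged model) - BIC(two separate models).
    Positive values favor merging. The penalty form is an assumption."""
    merged = frames_a + frames_b
    n_params = 2 * len(frames_a[0])  # mean + variance per dimension
    penalty = 0.5 * penalty_weight * n_params * math.log(len(merged))
    return (log_likelihood(merged) - log_likelihood(frames_a)
            - log_likelihood(frames_b) + penalty)

def fast_match_merge(clusters, shortlist_size=2, penalty_weight=1.0):
    """Pick the best merge, evaluating BIC only on the KL shortlist."""
    models = {name: fit_diag_gaussian(fr) for name, fr in clusters.items()}
    pairs = sorted(combinations(sorted(clusters), 2),
                   key=lambda p: symmetric_kl(models[p[0]], models[p[1]]))
    shortlist = pairs[:shortlist_size]
    score, a, b = max((delta_bic(clusters[a], clusters[b], penalty_weight), a, b)
                      for a, b in shortlist)
    return {"pair": (a, b), "delta_bic": score,
            "bic_evaluations": len(shortlist), "total_pairs": len(pairs)}
```

With four clusters there are six candidate pairs, so a `shortlist_size` of 2 means only two BIC comparisons are run per merge step. How small the shortlist can be made without hurting accuracy is an empirical question that this sketch does not answer.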
