Published in: IEEE Transactions on Audio, Speech, and Language Processing

A Comparative Study of Bottom-Up and Top-Down Approaches to Speaker Diarization


Abstract

This paper presents a theoretical framework to analyze the relative merits of the two most general, dominant approaches to speaker diarization involving bottom-up and top-down hierarchical clustering. We present an original qualitative comparison which argues how the two approaches are likely to exhibit different behavior in speaker inventory optimization and model training: bottom-up approaches will capture comparatively purer models and will thus be more sensitive to nuisance variation such as that related to the speech content; top-down approaches, in contrast, will produce less discriminative speaker models but, importantly, models which are potentially better normalized against nuisance variation. We report experiments conducted on two standard, single-channel NIST RT evaluation datasets which validate our hypotheses. Results show that competitive performance can be achieved with both bottom-up and top-down approaches (average DERs of 21% and 22%), and that neither approach is superior. Speaker purification, which aims to improve speaker discrimination, gives more consistent improvements with the top-down system than with the bottom-up system (average DERs of 19% and 25%), thereby confirming that the top-down system is less discriminative and that the bottom-up system is less stable. Finally, we report a new combination strategy that exploits the merits of the two approaches. Combination delivers an average DER of 17% and confirms the intrinsic complementarity of the two approaches.
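As a rough illustration of the two clustering directions contrasted in the abstract, the following toy sketch may help. It is not the paper's system: real diarization systems cluster acoustic speaker models, whereas this sketch assumes each speech segment has been reduced to a single hypothetical 1-D feature value, and the function names, the mean-distance merge rule, and the largest-gap split rule are all assumptions made for illustration only.

```python
from statistics import mean


def bottom_up(points, n_speakers):
    """Agglomerative toy: start with one cluster per segment, merge the closest pair."""
    clusters = [[p] for p in points]
    while len(clusters) > n_speakers:
        # Find the pair of clusters whose means are closest (assumed merge rule).
        i, j = min(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda ab: abs(mean(clusters[ab[0]]) - mean(clusters[ab[1]])),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


def top_down(points, n_speakers):
    """Divisive toy: start with one cluster, split the widest cluster at its largest gap."""
    clusters = [sorted(points)]
    while len(clusters) < n_speakers:
        # Pick the cluster with the largest spread (assumed split rule).
        widest = max(clusters, key=lambda c: c[-1] - c[0])
        if len(widest) < 2:
            break  # only singletons remain, nothing left to split
        # Split at the largest gap between neighbouring values.
        k = max(range(1, len(widest)), key=lambda idx: widest[idx] - widest[idx - 1])
        clusters.remove(widest)
        clusters += [widest[:k], widest[k:]]
    return sorted(clusters)


# Hypothetical per-segment feature values for three well-separated "speakers".
segments = [0.1, 0.2, 0.15, 5.0, 5.2, 9.8, 10.1]
print(bottom_up(segments, 3))
print(top_down(segments, 3))
```

On cleanly separated toy data such as this, both directions arrive at the same grouping. The differences the abstract describes (purer but more nuisance-sensitive bottom-up models, better normalized but less discriminative top-down models) concern real acoustic data and are not reproduced by this sketch.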
