A Comparative Study of Bottom-Up and Top-Down Approaches to Speaker Diarization

Evans N.; Bozonnet S.; Dong Wang; Fredouille C.; Troncy R.

Abstract

This paper presents a theoretical framework to analyze the relative merits of the two dominant, most general approaches to speaker diarization: bottom-up and top-down hierarchical clustering. We present an original qualitative comparison which argues that the two approaches are likely to behave differently in speaker inventory optimization and model training: bottom-up approaches will capture comparatively purer models and will thus be more sensitive to nuisance variation such as that related to the speech content; top-down approaches, in contrast, will produce less discriminative speaker models but, importantly, models which are potentially better normalized against nuisance variation. We report experiments conducted on two standard, single-channel NIST RT evaluation datasets which validate our hypotheses. Results show that competitive performance can be achieved with both bottom-up and top-down approaches (average diarization error rates, DERs, of 21% and 22%), and that neither approach is superior. Speaker purification, which aims to improve speaker discrimination, gives more consistent improvements with the top-down system than with the bottom-up system (average DERs of 19% and 25%), thereby confirming that the top-down system is less discriminative and that the bottom-up system is less stable. Finally, we report a new combination strategy that exploits the merits of the two approaches. Combination delivers an average DER of 17% and confirms the intrinsic complementarity of the two approaches.
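The DER figures quoted in the abstract are ratios of erroneous time to scored speech time. As a minimal sketch, assuming the commonly used definition (missed speech plus false-alarm speech plus speaker confusion, divided by total reference speech time), the metric can be computed as below. The function name and the durations are hypothetical, chosen only for illustration; they are not taken from the paper or from any scoring tool.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Return DER as a fraction of scored reference speech time.

    All arguments are durations in the same unit (e.g. seconds).
    Assumed definition: (missed + false alarm + confusion) / total speech.
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


# Hypothetical durations in seconds, not results from the paper.
der = diarization_error_rate(
    missed=30.0, false_alarm=20.0, confusion=55.0, total_speech=500.0
)
print(f"DER = {der:.1%}")  # prints "DER = 21.0%"
```

Because false alarms are counted in the numerator but not in the denominator, a DER computed this way can exceed 100% on badly segmented audio.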

Bibliographic details

Source
Audio, Speech, and Language Processing, IEEE Transactions on | 2012, Issue 2 | pp. 382-392 | 11 pages
Authors
Evans N.; Bozonnet S.; Dong Wang; Fredouille C.; Troncy R.
Author affiliation
Dept. of Multimedia Commun., EURECOM, Sophia Antipolis, France
Original format: PDF
Language: English
Keywords
Clustering; rich transcription; segmentation; speaker diarization

