Robust statistical processing of TDOA estimates for distant speaker diarization

Abstract

Speaker diarization systems aim to segment an audio signal into homogeneous sections with only one active speaker and answer the question "who spoke when?" We present a novel approach to speaker diarization exploiting spatial information through robust statistical modeling of Time Difference of Arrival (TDOA) estimates obtained using pairs of microphones. The TDOAs are modeled with Gaussian Mixture Models (GMMs) trained in a robust manner with the expectation-conditional maximization algorithm and a minorization-maximization approach. In situations of multiple microphone deployment, our method allows for the selection of the best microphone pair as part of the modeling and supports ad-hoc microphone placement. The selected pair can be useful information for subsequent speech processing algorithms. We show that our method, which uses only spatial information, achieves up to a 36.1% relative reduction in speaker error time compared to an open-source toolkit using TDOA features, when tested on the NIST RT05 multiparty meeting database.
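
The pipeline the abstract outlines (TDOA estimates from a microphone pair, then a GMM over those estimates) can be illustrated with a minimal sketch. Several things here are assumptions and not taken from the paper: the abstract does not say how the TDOAs are estimated, so the sketch uses GCC-PHAT, a common choice; the GMM is fitted with plain EM, not the robust expectation-conditional maximization and minorization-maximization training the authors describe; and the function names `gcc_phat_tdoa` and `fit_gmm_1d` are hypothetical.

```python
import numpy as np


def gcc_phat_tdoa(x, y, fs, max_tau=None):
    """Estimate the delay of y relative to x, in seconds.

    Assumption: GCC-PHAT is used here for illustration only.
    """
    n = len(x) + len(y)                    # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)
    cross /= np.abs(cross) + 1e-12         # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that lags run from -max_shift to +max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs


def fit_gmm_1d(tdoas, n_components, n_iter=100, var_floor=1e-6):
    """Fit a 1-D GMM to TDOA values with plain EM (not the paper's robust training).

    Returns weights, means, variances and per-frame responsibilities.
    """
    t = np.asarray(tdoas, dtype=float)
    # Initialise the means at evenly spaced quantiles of the data
    means = np.quantile(t, (np.arange(n_components) + 0.5) / n_components)
    variances = np.full(n_components, t.var() + var_floor)
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in the log domain for stability
        diff = t[:, None] - means[None, :]
        log_p = (np.log(weights) - 0.5 * np.log(2 * np.pi * variances)
                 - 0.5 * diff ** 2 / variances)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / len(t)
        means = (resp * t[:, None]).sum(axis=0) / nk
        variances = (resp * (t[:, None] - means) ** 2).sum(axis=0) / nk + var_floor
    return weights, means, variances, resp
```

In this sketch, each frame of a two-channel recording would be passed to `gcc_phat_tdoa`, and the resulting TDOA sequence (for example in milliseconds, so that the default `var_floor` is on a sensible scale) to `fit_gmm_1d` with one component per assumed speaker position; `resp.argmax(axis=1)` then gives a per-frame component label.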

Bibliographic details

Source
European Signal Processing Conference | 2017 | 667p | 5 pages
Authors
Pablo Peso Parada; Dushyant Sharma; Toon van Waterschoot; Patrick A. Naylor;
Original format: PDF
Chinese Library Classification: TN911.7-53
Keywords
Microphones; Feature extraction; Standards; Computational modeling; Robustness; Hidden Markov models; Indexes;

Similar literature

1. Robust distant speaker recognition based on position-dependent CMN by combining speaker-specific GMM with speaker-adapted HMM [J]. Longbiao Wang, Norihide Kitaoka, Seiichi Nakagawa. Speech Communication. 2007, No. 6
2. Wordless Sounds: Robust Speaker Diarization Using Privacy-Preserving Audio Representations [J]. Parthasarathi S. H. K., Bourlard H., Gatica-Perez D. Audio, Speech, and Language Processing, IEEE Transactions on. 2013, No. 1
3. Harmonic Structure Features for Robust Speaker Diarization [J]. Yu Zhou, Hongbin Suo, Junfeng Li. ETRI Journal. 2012, No. 4
4. Robust statistical processing of TDOA estimates for distant speaker diarization [C]. Pablo Peso Parada, Dushyant Sharma, Toon van Waterschoot. European Signal Processing Conference. 2017
5. Automatic Speaker Recognition and Diarization in Co-Channel Speech [D]. Shokouhi, Navid. 2017
6. Supervised Speaker Diarization Using Random Forests: A Tool for Psychotherapy Process Research [O]. Lukas Fürer, Nathalie Schenk, Volker Roth. 2020
7. Robust statistical processing of TDOA estimates for distant speaker diarization [O]. Pablo Peso Parada, Dushyant Sharma, Toon van Waterschoot. 2017
8. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment. [R]. Hansen, J. H. 2015
