Conference: European Signal Processing Conference

Robust statistical processing of TDOA estimates for distant speaker diarization

Abstract

Speaker diarization systems aim to segment an audio signal into homogeneous sections, each with only one active speaker, and so answer the question "who spoke when?" We present a novel approach to speaker diarization that exploits spatial information through robust statistical modeling of Time Difference of Arrival (TDOA) estimates obtained from pairs of microphones. The TDOAs are modeled with Gaussian Mixture Models (GMMs) trained in a robust manner with the expectation-conditional maximization algorithm and a minorization-maximization approach. When multiple microphones are deployed, our method allows the best microphone pair to be selected as part of the modeling and supports ad-hoc microphone placement. Such information can be useful for subsequent speech processing algorithms. We show that our method, which uses only spatial information, achieves up to a 36.1% relative reduction in speaker error time compared to an open-source toolkit using TDOA features, when tested on the NIST RT05 multiparty meeting database.
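As a rough illustration of the modeling idea in the abstract, the sketch below fits a one-dimensional Gaussian mixture to TDOA values from a single microphone pair and assigns each frame to the most likely component, which stands in for a speaker. It rests on several assumptions that are not from the paper: the TDOA values are synthetic, the training is plain EM rather than the robust expectation-conditional maximization and minorization-maximization training the abstract describes, and the names `fit_gmm_1d` and `assign_speakers` are hypothetical.

```python
import math
import random


def fit_gmm_1d(x, k, iters=50):
    """Fit a k-component 1-D Gaussian mixture with plain EM.

    Note: plain EM is a stand-in here; it is NOT the robust ECM/MM training
    named in the abstract.
    """
    n = len(x)
    xs = sorted(x)
    # Assumption: initialise means at evenly spaced quantiles of the data.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(x) / n
    variances = [sum((v - centre) ** 2 for v in x) / n] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each TDOA sample.
        resp = []
        for v in x:
            p = [w * math.exp(-0.5 * (v - m) ** 2 / s) / math.sqrt(2 * math.pi * s)
                 for w, m, s in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            var = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, x)) / nj
            variances[j] = max(var, 1e-6)  # variance floor avoids collapse
    return weights, means, variances


def assign_speakers(x, weights, means, variances):
    """Label each sample with the component of highest posterior probability."""
    labels = []
    for v in x:
        scores = [math.log(w) - 0.5 * math.log(s) - 0.5 * (v - m) ** 2 / s
                  for w, m, s in zip(weights, means, variances)]
        labels.append(scores.index(max(scores)))
    return labels


# Synthetic data (assumption): two speakers whose TDOAs, in milliseconds,
# cluster around -0.4 ms and +0.3 ms for one microphone pair.
rng = random.Random(0)
true_labels = [0] * 200 + [1] * 200
tdoa_ms = ([rng.gauss(-0.4, 0.05) for _ in range(200)]
           + [rng.gauss(0.3, 0.05) for _ in range(200)])

weights, means, variances = fit_gmm_1d(tdoa_ms, k=2)
labels = assign_speakers(tdoa_ms, weights, means, variances)
accuracy = sum(a == b for a, b in zip(labels, true_labels)) / len(labels)
print("estimated means (ms):", means)
print("frame-level accuracy:", accuracy)
```

Plain EM is sensitive to outlying TDOA estimates, which is presumably one motivation for the robust training the abstract mentions; this sketch makes no attempt to reproduce that robustness or the microphone-pair selection.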
