Conference: European Signal Processing Conference

Blind spatial sound source clustering and activity detection using uncalibrated microphone array

Abstract

This paper presents a method for estimating both the number and the activity periods of spatially distributed sound sources using an uncalibrated microphone array. The method is applied to speaker diarization. In general, speaker diarization has difficulty with: 1) estimating the number of sound sources (speakers), and 2) detecting the activity of multiple sound sources, including overlapping utterances. Several microphone-array-based techniques have already tackled these challenges. However, existing methods mainly assume that the steering vectors for the microphone array are calibrated in advance to identify sound sources, an assumption that is difficult to satisfy when ad hoc or flexible microphone arrays are used. Our approach therefore estimates the number of sound sources blindly, in two steps. First, the Time Delay of Arrival (TDOA) of the observed signal is clustered. Second, sound source activity is detected by clustering the long-term spatial spectrum using the TDOA-based steering vector for each cluster. The validity of the algorithm is confirmed with both synthesized signals and a real-world flexible microphone array application.
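
The first step described in the abstract (clustering per-frame TDOAs to estimate the number of sources without calibration) can be illustrated with a small, heavily simplified sketch. This is not the authors' implementation. It assumes two microphones, integer-sample delays, and sources that do not overlap. It uses plain time-domain cross-correlation and gap-based 1-D grouping as stand-ins for whatever TDOA estimator and clustering method the paper uses. In place of the long-term spatial spectrum clustering of the second step, it assigns each frame to the nearest TDOA cluster. All function names and parameters (`frame_tdoa`, `cluster_1d`, `estimate_sources`, `energy_floor`, `tol`, `min_frames`) are hypothetical.

```python
import random

def frame_tdoa(x1, x2, max_lag):
    """Return the lag (in samples) that maximizes sum_i x1[i] * x2[i + lag]."""
    n = len(x1)
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        lo, hi = max(0, -lag), min(n, n - lag)  # valid overlap for this lag
        val = sum(x1[i] * x2[i + lag] for i in range(lo, hi))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

def cluster_1d(values, tol):
    """Group sorted values, starting a new cluster wherever the gap exceeds tol."""
    clusters = []
    for v in sorted(values):
        if clusters and v - clusters[-1][-1] <= tol:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return clusters

def synthesize(delays_per_frame, frame_len, max_lag, rng):
    """Build two-channel frames; a delay of None means no source is active."""
    frames = []
    for d in delays_per_frame:
        n1 = [rng.gauss(0, 0.05) for _ in range(frame_len)]  # sensor noise, mic 1
        n2 = [rng.gauss(0, 0.05) for _ in range(frame_len)]  # sensor noise, mic 2
        if d is None:
            frames.append((n1, n2))
            continue
        s = [rng.gauss(0, 1) for _ in range(frame_len + 2 * max_lag)]
        x1 = [s[max_lag + i] + n1[i] for i in range(frame_len)]
        x2 = [s[max_lag - d + i] + n2[i] for i in range(frame_len)]  # mic 2 lags by d
        frames.append((x1, x2))
    return frames

def estimate_sources(frames, max_lag, energy_floor=0.1, tol=1, min_frames=3):
    """Return (cluster TDOAs, per-frame cluster index or None for inactive frames)."""
    tdoas = {}
    for idx, (x1, x2) in enumerate(frames):
        energy = sum(v * v for v in x1) / len(x1)
        if energy >= energy_floor:  # skip frames that are too quiet to be a source
            tdoas[idx] = frame_tdoa(x1, x2, max_lag)
    # Small clusters are treated as spurious estimates and discarded.
    clusters = [c for c in cluster_1d(tdoas.values(), tol) if len(c) >= min_frames]
    centers = [sum(c) / len(c) for c in clusters]
    activity = [None] * len(frames)
    for idx, t in tdoas.items():
        if centers:
            k = min(range(len(centers)), key=lambda j: abs(centers[j] - t))
            if abs(centers[k] - t) <= tol:
                activity[idx] = k
    return centers, activity

# Synthetic scene: source A (delay +3), then silence, then source B (delay -4).
rng = random.Random(0)
truth = [3] * 10 + [None] * 10 + [-4] * 10
frames = synthesize(truth, frame_len=256, max_lag=8, rng=rng)
centers, activity = estimate_sources(frames, max_lag=8)
print("estimated number of sources:", len(centers))
print("cluster TDOAs (samples):", centers)
print("per-frame activity:", activity)
```

The number of surviving clusters serves as the source-count estimate, and the per-frame labels give a crude activity timeline. Handling overlapping speech, which the abstract names as a core difficulty, would need the per-cluster spatial spectrum analysis that this sketch omits.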
