Journal: IEEE Transactions on Audio, Speech, and Language Processing

Speech Activity Detection for Multi-Party Conversation Analyses Based on Likelihood Ratio Test on Spatial Magnitude


Abstract

This paper proposes a microphone array-based speech activity detection (SAD) method for analyzing multi-party conversations recorded in the presence of noise. In particular, the proposed method considers conversations in which neither the number of speakers nor their locations can be restricted, such as conversations held while standing and talking or at poster sessions. In such conversations, directional noise sources and diffuse noise affect the direction-of-arrival estimates of the target speech signals. To detect speech activity without a priori knowledge of the speakers and noise environments, a likelihood ratio test (LRT)-based SAD method is applied to the spatial magnitude, which is estimated by time-frequency masking of the observed spectra. The proposed method can exploit the enhanced signals obtained from time-frequency masking, and works even in the presence of environmental noise. Experiments with recorded simulated poster sessions confirmed that the proposed method could outperform conventional methods based on the LRT for a single channel, magnitude coherence, or cross-power spectrum phase.
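To make the LRT step concrete, the following is a minimal sketch of the widely used Gaussian-model likelihood ratio test for a single frame. It is not the paper's method: the abstract does not specify how the spatial magnitude or its statistics are computed, so the inputs `power`, `noise_var`, and `snr_prior`, the function names, and the threshold value are all assumptions made for illustration. In the proposed method, the tested quantity would presumably be derived from the time-frequency-masked spectra rather than from a raw single-channel spectrum.

```python
import math


def frame_log_likelihood_ratio(power, noise_var, snr_prior):
    """Mean per-bin log likelihood ratio for one frame (Gaussian model).

    power:     observed power per frequency bin (assumed input)
    noise_var: noise variance per bin (assumed to be known or estimated)
    snr_prior: a priori SNR per bin (assumed to be known or estimated)
    """
    total = 0.0
    for p, n, xi in zip(power, noise_var, snr_prior):
        gamma = p / n  # a posteriori SNR for this bin
        # log of (1 / (1 + xi)) * exp(gamma * xi / (1 + xi))
        total += gamma * xi / (1.0 + xi) - math.log1p(xi)
    return total / len(power)


def is_speech(power, noise_var, snr_prior, threshold=0.0):
    """Declare speech when the averaged log ratio exceeds a threshold.

    The threshold of 0.0 is a placeholder, not a value from the paper.
    """
    return frame_log_likelihood_ratio(power, noise_var, snr_prior) > threshold


# Illustrative values only: four bins, unit noise variance, a priori SNR of 4.
noise_only = is_speech([1.0] * 4, [1.0] * 4, [4.0] * 4)
with_speech = is_speech([10.0] * 4, [1.0] * 4, [4.0] * 4)
```

Averaging the per-bin log ratios before thresholding keeps the decision from being dominated by a single noisy bin; in practice the threshold would be tuned on held-out recordings.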
