Source: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

Robust speaker's location detection in a vehicle environment using GMM models

Abstract

Human-computer interaction (HCI) using speech communication is becoming increasingly important, especially in driving, where safety is the primary concern. Knowing the speaker's location (i.e., speaker localization) not only improves the enhancement of a corrupted signal but also assists speaker identification. Because conventional speech localization algorithms suffer from the uncertainties of environmental complexity and noise, as well as from the microphone mismatch problem, they are frequently not robust in practice. Without high reliability, speech-based HCI will not gain acceptance. This work presents a novel method for detecting the speaker's location and demonstrates high accuracy within a vehicle cabin using a single linear microphone array. The proposed approach uses Gaussian mixture models (GMMs) to model the distributions of the phase differences among the microphones caused by the complex characteristics of room acoustics and microphone mismatch. The model can be applied in both near-field and far-field situations in a noisy environment. Each Gaussian component of a GMM represents a general phase difference distribution that is location-dependent but independent of content and speaker. Moreover, the scheme performs well not only in non-line-of-sight cases but also when the speakers are aligned toward the microphone array but at different distances from it. This strong performance is achieved by exploiting the fact that the phase difference distributions at different locations are distinguishable in the environment of a car. The experimental results also show that the proposed method outperforms the conventional multiple signal classification (MUSIC) technique at various SNRs.
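The core idea of the abstract, one GMM per candidate location fitted to inter-microphone phase differences and a maximum-likelihood choice among locations, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the diagonal-covariance EM, the function names (`fit_diag_gmm`, `detect_location`, `make_frames`), the feature layout (one phase-difference vector per frame, in radians) and the synthetic data are all assumptions, and phase wrapping is ignored for simplicity.

```python
import numpy as np

def log_component_probs(X, weights, means, variances):
    """Log of weight_k * N(x | mean_k, diag(var_k)) per frame and component."""
    diff = X[:, None, :] - means[None, :, :]              # shape (n, k, d)
    log_det = np.log(2.0 * np.pi * variances).sum(axis=1)  # shape (k,)
    mahal = (diff ** 2 / variances[None, :, :]).sum(axis=2)
    return np.log(weights) - 0.5 * (log_det + mahal)      # shape (n, k)

def log_likelihood(X, model):
    """Total log-likelihood of all frames in X under one GMM."""
    log_p = log_component_probs(X, *model)
    peak = log_p.max(axis=1, keepdims=True)               # stable log-sum-exp
    per_frame = peak[:, 0] + np.log(np.exp(log_p - peak).sum(axis=1))
    return float(per_frame.sum())

def fit_diag_gmm(X, k, iters=50, seed=0):
    """Fit a k-component diagonal-covariance GMM with plain EM (assumed model)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    means = X[rng.choice(n, size=k, replace=False)]
    variances = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each frame.
        log_p = log_component_probs(X, weights, means, variances)
        resp = np.exp(log_p - log_p.max(axis=1, keepdims=True))
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        second_moment = (resp.T @ (X ** 2)) / nk[:, None]
        variances = np.maximum(second_moment - means ** 2, 1e-6)
    return weights, means, variances

def detect_location(frames, models):
    """Return the location whose GMM gives the frames the highest likelihood."""
    scores = {name: log_likelihood(frames, m) for name, m in models.items()}
    return max(scores, key=scores.get)

def make_frames(center, n, rng, spread=0.15, noise=0.1):
    """Synthetic phase-difference frames (made-up data, two sub-clusters)."""
    offsets = rng.choice([-spread, spread], size=(n, 1))
    return center + offsets + rng.normal(0.0, noise, size=(n, len(center)))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Hypothetical mean phase differences (radians) for two seats.
    centers = {"driver": np.array([0.6, -0.3, 0.9]),
               "passenger": np.array([-0.5, 0.4, -0.2])}
    models = {name: fit_diag_gmm(make_frames(c, 400, rng), k=2)
              for name, c in centers.items()}
    print(detect_location(make_frames(centers["driver"], 50, rng), models))
```

Summing frame log-likelihoods treats frames as independent, which keeps the decision rule simple. A real system would extract the phase differences from the array signals per frequency band and would need to handle wrapping at ±π, neither of which is shown here.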
