Published in: IEEE Transactions on Audio, Speech, and Language Processing

Convex Combination of Multiple Statistical Models With Application to VAD



Abstract

This paper proposes a robust voice activity detector (VAD) based on the observation that the distribution of speech captured with far-field microphones is highly varying, depending on the noise and reverberation conditions. The proposed VAD employs a convex combination scheme comprising three statistical distributions—a Gaussian, a Laplacian, and a two-sided Gamma—to effectively model captured speech. This scheme shows increased ability to adapt to dynamic acoustic environments. The contribution of each distribution to this convex combination is automatically adjusted based on the statistical characteristics of the instantaneous audio input. To further improve the performance of the system, an adaptive threshold is introduced, while a decision-smoothing scheme caters to the intra-frame correlation of speech signals. Extensive experiments under realistic scenarios support the proposed approach of combining several models for increased adaptation and performance.
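As a rough illustration of the convex combination idea in the abstract, the sketch below evaluates a weighted sum of a Gaussian, a Laplacian, and a two-sided Gamma density, with non-negative weights that sum to one. It is a minimal sketch under stated assumptions, not the paper's implementation: the zero-mean parameterization by a common standard deviation `sigma`, the shape-1/2 form of the two-sided Gamma, the function names, and the fixed example weights are all illustrative choices. The rule that adapts the weights to the input is not specified in the abstract and is not shown.

```python
import math


def gaussian_pdf(x, sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


def laplacian_pdf(x, sigma):
    """Zero-mean Laplacian density scaled to have standard deviation sigma."""
    return math.exp(-math.sqrt(2.0) * abs(x) / sigma) / (math.sqrt(2.0) * sigma)


def two_sided_gamma_pdf(x, sigma, eps=1e-12):
    """Zero-mean two-sided Gamma density (shape 1/2, assumed here) with
    standard deviation sigma. The density is unbounded at x = 0, so |x| is
    floored at eps to keep the evaluation finite."""
    ax = max(abs(x), eps)
    beta = math.sqrt(3.0) / (2.0 * sigma)  # rate giving variance sigma**2
    return 0.5 * math.sqrt(beta / math.pi) * ax ** -0.5 * math.exp(-beta * ax)


def convex_mixture_pdf(x, sigma, weights):
    """Convex combination of the three densities.

    weights = (w_gauss, w_laplace, w_gamma) must be non-negative and sum
    to 1. How the weights are adapted is outside this sketch.
    """
    if len(weights) != 3:
        raise ValueError("expected three weights")
    if any(w < 0.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    densities = (
        gaussian_pdf(x, sigma),
        laplacian_pdf(x, sigma),
        two_sided_gamma_pdf(x, sigma),
    )
    return sum(w * p for w, p in zip(weights, densities))


if __name__ == "__main__":
    # Hypothetical weights chosen only for illustration.
    print(convex_mixture_pdf(0.5, 1.0, (0.2, 0.3, 0.5)))
```

Because the weights lie on the probability simplex, the combined value at any point always falls between the smallest and largest of the three component densities, and setting one weight to 1 recovers that single model.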
