Journal: Computer Speech and Language

ALTIS: A new algorithm for adaptive long-term SNR estimation in multi-talker babble



Abstract

We introduce a real-time-capable algorithm that estimates the long-term signal-to-noise ratio (SNR) of speech in multi-talker babble noise. In real-time applications, the long-term SNR is calculated over a sufficiently long moving frame of the noisy speech ending at the current time. The algorithm performs real-time long-term SNR estimation by averaging "speech-likeness" values of multiple consecutive short frames of the noisy speech, which collectively form a long frame with an adaptive length. The algorithm is calibrated to be insensitive to short-term fluctuations and transient changes in speech or noise level. However, it responds quickly to non-transient changes in long-term SNR by adjusting the duration of the long frame over which the long-term SNR is measured. This ability is obtained by employing an event detector and an adaptive frame duration. The event detector identifies non-transient changes in the long-term SNR and optimizes the duration of the long frame accordingly. The algorithm was trained and tested on randomly generated speech samples corrupted with multi-talker babble. In addition to showing that the algorithm provides an adaptive long-term SNR estimate in dynamic noisy conditions, the evaluation results show that it outperforms existing overall SNR estimation methods in multi-talker babble over a wide range of talker counts and SNRs. The relatively low computational cost and the ability to update the estimated long-term SNR several times per second make the algorithm suitable for real-time speech processing applications. (C) 2019 Elsevier Ltd. All rights reserved.
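The adaptive long-frame idea described in the abstract can be illustrated with a minimal Python sketch. This is not the ALTIS implementation: the class name `AdaptiveLongTermAverager`, the parameters `max_frames`, `probe_frames`, and `jump_threshold`, and the simple mean-shift test standing in for the event detector are all assumptions made for illustration. The abstract does not say how "speech-likeness" is computed or how the averaged value is mapped to an SNR in dB, so the sketch takes per-short-frame scores as given and returns only their adaptive average.

```python
from collections import deque
from statistics import fmean


class AdaptiveLongTermAverager:
    """Averages per-short-frame scores over a long frame of adaptive length.

    Hypothetical illustration only; names and thresholds are assumptions.
    """

    def __init__(self, max_frames=50, probe_frames=5, jump_threshold=0.3):
        self.max_frames = max_frames          # upper bound on long-frame length
        self.probe_frames = probe_frames      # most recent frames used for detection
        self.jump_threshold = jump_threshold  # mean shift treated as an event
        self.window = deque(maxlen=max_frames)

    def update(self, score):
        """Add one short-frame score and return the current long-frame average."""
        self.window.append(score)
        if len(self.window) >= 2 * self.probe_frames:
            frames = list(self.window)
            recent = fmean(frames[-self.probe_frames:])
            older = fmean(frames[:-self.probe_frames])
            if abs(recent - older) > self.jump_threshold:
                # Assumed event rule: a shift in the recent mean is treated as
                # non-transient, so the long frame shrinks to the recent frames.
                self.window = deque(frames[-self.probe_frames:],
                                    maxlen=self.max_frames)
        return fmean(list(self.window))


if __name__ == "__main__":
    est = AdaptiveLongTermAverager()
    for s in [0.2] * 20 + [0.8] * 10:
        value = est.update(s)
    print(f"estimate after level change: {value:.2f}, "
          f"long-frame length: {len(est.window)}")
```

In this sketch a single outlying score barely moves the average because it is diluted across the whole long frame, while a run of shifted scores triggers the window reset and lets the estimate track the new level faster than a fixed-length average would.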
