European Signal Processing Conference

Sparse time-frequency representations in audio processing, as studied through a symmetrized lognormal model

Abstract

Time-frequency representations are ubiquitous in speech and audio signal processing, their use being motivated by both auditory physiology and the mathematics of Fourier analysis. Nonparametric statistical models (or, equivalently, transform-based signal processing methods) formulated in this space provide a principled way to decompose sounds into their constituent parts, as well as an effective means of exploiting the local correlation present in the time-frequency structure of naturally generated acoustic signals. Here we describe how an appropriate generative statistical model, even under very simple assumptions, provides a means of exploring sparse time-frequency representations in audio. We introduce a symmetrized lognormal model for spectral coefficients, which shows good agreement across a broad range of speech samples taken from the TIMIT database, and demonstrate preliminary speech enhancement results based on a maximum a posteriori shrinkage estimator.
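The abstract does not give the model's formula, so the following is only a minimal sketch of one plausible reading: a coefficient whose magnitude is lognormal and whose sign is equally likely to be positive or negative. The function names, the parameters `mu`, `sigma`, and `noise_std`, the additive Gaussian noise assumption, and the grid-search maximization are all illustrative assumptions, not the paper's estimator.

```python
import numpy as np


def symmetrized_lognormal_pdf(x, mu=0.0, sigma=1.0):
    """Assumed density: 0.5 * lognormal(|x|; mu, sigma), set to zero at x = 0."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    a = np.abs(x)
    pdf = np.zeros_like(a)
    nz = a > 0  # avoid log(0); the density tends to zero there anyway
    log_a = np.log(a[nz])
    pdf[nz] = np.exp(-(log_a - mu) ** 2 / (2.0 * sigma ** 2)) / (
        2.0 * a[nz] * sigma * np.sqrt(2.0 * np.pi)
    )
    return pdf


def map_shrink(y, mu=0.0, sigma=1.0, noise_std=1.0, n_grid=20001):
    """Grid-search MAP estimate of x from y = x + Gaussian noise (hypothetical setup)."""
    lim = abs(y) + 5.0 * noise_std
    grid = np.linspace(-lim, lim, n_grid)
    prior = symmetrized_lognormal_pdf(grid, mu, sigma)
    log_prior = np.full_like(grid, -np.inf)
    pos = prior > 0
    log_prior[pos] = np.log(prior[pos])
    # Log-posterior up to a constant: Gaussian log-likelihood plus log-prior.
    objective = log_prior - (y - grid) ** 2 / (2.0 * noise_std ** 2)
    return float(grid[np.argmax(objective)])
```

Under these assumptions, an observation well above the prior mode (for example `map_shrink(5.0)`) is pulled toward smaller magnitude while keeping its sign, which is the shrinkage behaviour the abstract alludes to. A closed-form or iterative solver would be preferable to a grid search in practice.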
