Journal: IEEE Transactions on Audio, Speech, and Language Processing

MDCT Sinusoidal Analysis for Audio Signals Analysis and Processing


Abstract

The Modified Discrete Cosine Transform (MDCT) is widely used in audio signal compression, but its use is mostly limited to representing audio signals. This is because the MDCT is a real transform: phase information is missing, and spectral power varies from frame to frame even for pure sine waves. We make a key observation concerning the structure of the MDCT spectrum of a sine wave: across frames, the complete spectrum changes substantially, but if it is separated into even and odd subspectra, neither changes except for scaling. Inspired by this observation, we find that the MDCT spectrum of a sine wave can be represented as an envelope factor times a phase-modulation factor. The envelope factor is shift-invariant and depends only on the sine wave's amplitude and frequency, so it stays constant over frames. The phase-modulation factor has the form $\sin\theta$ for all odd bins and $\cos\theta$ for all even bins, which gives the subspectra their constant shapes. However, $\theta$ depends on the start point of the transform frame, so it changes at each new frame and thereby changes the whole spectrum. We apply this formulation of the MDCT spectral structure to frequency estimation in the MDCT domain, both for pure sine waves and for sine waves in noise. Compared to existing methods, ours are more accurate and more general (not limited to the sine window). We also apply the spectral structure to stereo coding. A pure-tone or tone-dominant stereo signal may have very different left and right MDCT spectra, but their subspectra have similar shapes. One ratio for the even bins and one ratio for the odd bins are enough to reconstruct the right channel from the left, saving half the bitrate. This scheme is simple and at the same time more efficient than traditional Intensity Stereo (IS).
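The even/odd observation in the abstract can be checked numerically. The sketch below is not the authors' code. It assumes one common MDCT definition, $X[k] = \sum_{n=0}^{2N-1} w[n]\,x[n]\cos\!\left(\frac{\pi}{N}\left(n + \frac{1}{2} + \frac{N}{2}\right)\left(k + \frac{1}{2}\right)\right)$, a sine window, and arbitrarily chosen illustration values (`N = 64`, a tone near bin 20.3, a 20-sample frame offset). The helpers `mdct` and `abs_cosine` are hypothetical names introduced here. The script compares two frames of the same sine wave, then fits one scaling ratio for the even bins and one for the odd bins, in the spirit of the two-ratio stereo idea.

```python
import numpy as np

def mdct(frame, window):
    """Direct O(N^2) MDCT of one 2N-sample frame; returns N coefficients."""
    n_bins = len(frame) // 2
    n = np.arange(2 * n_bins)
    k = np.arange(n_bins)
    # Basis cos(pi/N * (n + 1/2 + N/2) * (k + 1/2)), shape (N, 2N).
    basis = np.cos(np.pi / n_bins * np.outer(k + 0.5, n + 0.5 + n_bins / 2))
    return basis @ (window * frame)

def abs_cosine(a, b):
    """Absolute cosine similarity: 1.0 means same shape up to a (signed) scale."""
    return abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Assumed illustration parameters (not taken from the paper).
N = 64                                   # number of MDCT bins per frame
offset = 20                              # start of the second frame, in samples
window = np.sin(np.pi * (np.arange(2 * N) + 0.5) / (2 * N))  # sine window
omega = np.pi * 20.8 / N                 # tone between bins 20 and 21
signal = np.cos(omega * np.arange(2 * N + offset))

X1 = mdct(signal[:2 * N], window)                 # frame starting at sample 0
X2 = mdct(signal[offset:offset + 2 * N], window)  # frame starting at sample 20

# The full spectra differ in shape; the even and odd subspectra do not.
full_sim = abs_cosine(X1, X2)
even_sim = abs_cosine(X1[0::2], X2[0::2])
odd_sim = abs_cosine(X1[1::2], X2[1::2])

# Least-squares scaling ratios: one for even bins, one for odd bins.
r_even = (X1[0::2] @ X2[0::2]) / (X1[0::2] @ X1[0::2])
r_odd = (X1[1::2] @ X2[1::2]) / (X1[1::2] @ X1[1::2])

# Rebuild the second spectrum from the first using only the two ratios.
recon = np.empty_like(X2)
recon[0::2] = r_even * X1[0::2]
recon[1::2] = r_odd * X1[1::2]
rel_err = np.linalg.norm(X2 - recon) / np.linalg.norm(X2)

print(f"full-spectrum similarity: {full_sim:.4f}")
print(f"even / odd subspectrum similarity: {even_sim:.6f} / {odd_sim:.6f}")
print(f"ratios (even, odd): {r_even:.4f}, {r_odd:.4f}")
print(f"relative reconstruction error: {rel_err:.2e}")
```

The match between subspectra is approximate rather than exact, since the negative-frequency image of the tone leaks a small amount into every bin. The approximation can also degrade when $\cos\theta$ or $\sin\theta$ is close to zero in one of the frames, because the corresponding subspectrum is then dominated by that leakage.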
