MDCT Sinusoidal Analysis for Audio Signals Analysis and Processing

Zhang S.; Dou W.; Yang H.

Abstract

The Modified Discrete Cosine Transform (MDCT) is widely used in audio signal compression, but its use is mostly limited to representing audio signals rather than analyzing them. This is because the MDCT is a real transform: phase information is missing, and spectral power varies from frame to frame even for pure sine waves. We make a key observation about the structure of the MDCT spectrum of a sine wave: across frames, the complete spectrum changes substantially, but if it is separated into even and odd subspectra, neither changes except for scaling. Inspired by this observation, we find that the MDCT spectrum of a sine wave can be represented as an envelope factor times a phase-modulation factor. The envelope factor is shift-invariant and depends only on the sine wave's amplitude and frequency, so it stays constant over frames. The phase-modulation factor has the form $\sin\theta$ for all odd bins and $\cos\theta$ for all even bins, which gives the subspectra their constant shapes. However, $\theta$ depends on the start point of the transform frame, so it changes at each new frame and thereby changes the whole spectrum. We apply this formulation of the MDCT spectral structure to frequency estimation in the MDCT domain, both for pure sine waves and for sine waves in noise. Compared to existing methods, ours are more accurate and more general (not limited to the sine window). We also apply the spectral structure to stereo coding. A pure-tone or tone-dominant stereo signal may have very different left and right MDCT spectra, but their subspectra have similar shapes. One ratio for the even bins and one ratio for the odd bins are enough to reconstruct the right channel from the left, saving half the bitrate. This scheme is simple and at the same time more efficient than traditional Intensity Stereo (IS).
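
The even/odd observation and the two-ratio stereo idea described in the abstract can be checked numerically. The following is a minimal sketch, not the authors' implementation: the frame size, tone frequency, phases, amplitudes, sine window, and the helper names (mdct, shape_similarity, ls_ratio, tone) are all assumptions chosen for illustration, and the least-squares ratio is one plausible way to pick the two scale factors.

```python
import math

def mdct(frame, window):
    """Direct O(N^2) MDCT of one frame of length 2N; returns N coefficients."""
    n_bins = len(frame) // 2
    n0 = 0.5 + n_bins / 2.0
    return [
        sum(w * x * math.cos(math.pi / n_bins * (n + n0) * (k + 0.5))
            for n, (w, x) in enumerate(zip(window, frame)))
        for k in range(n_bins)
    ]

def shape_similarity(a, b):
    """Absolute cosine similarity; 1.0 means equal shape up to a scale factor."""
    dot = sum(x * y for x, y in zip(a, b))
    return abs(dot) / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))

def ls_ratio(src, dst):
    """Least-squares scalar r minimising ||dst - r * src||."""
    return sum(s * d for s, d in zip(src, dst)) / sum(s * s for s in src)

# Illustrative parameters (assumptions, not taken from the paper).
N = 64          # MDCT bins per frame; frame length 2N, hop size N
FREQ = 0.123    # tone frequency in cycles per sample
window = [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def tone(amp, phase, length):
    """Pure cosine tone at FREQ with the given amplitude and start phase."""
    return [amp * math.cos(2 * math.pi * FREQ * n + phase) for n in range(length)]

left = tone(1.0, 0.3, 5 * N)
right = tone(0.6, 1.3, 5 * N)   # same tone, different level and phase
spectra = [mdct(left[m * N:m * N + 2 * N], window) for m in range(4)]

# 1) Compare frame 0 with later frames: full spectrum vs. even/odd subspectra.
full_sim = [shape_similarity(spectra[0], s) for s in spectra[1:]]
even_sim = [shape_similarity(spectra[0][0::2], s[0::2]) for s in spectra[1:]]
odd_sim = [shape_similarity(spectra[0][1::2], s[1::2]) for s in spectra[1:]]

# 2) Stereo: rebuild the right spectrum of frame 0 from the left one
#    using one ratio for the even bins and one for the odd bins.
L0 = spectra[0]
R0 = mdct(right[:2 * N], window)
r_even = ls_ratio(L0[0::2], R0[0::2])
r_odd = ls_ratio(L0[1::2], R0[1::2])
R0_hat = [(r_even if k % 2 == 0 else r_odd) * c for k, c in enumerate(L0)]
rel_err = math.sqrt(sum((a - b) ** 2 for a, b in zip(R0, R0_hat))
                    / sum(a * a for a in R0))

print("full spectrum similarity:", [round(v, 3) for v in full_sim])
print("even subspectrum similarity:", [round(v, 3) for v in even_sim])
print("odd subspectrum similarity:", [round(v, 3) for v in odd_sim])
print("two-ratio stereo relative error:", round(rel_err, 4))
```

With these assumed parameters, the even and odd similarities should stay close to 1 across frames while the full-spectrum similarity drops well below 1 for some frames, and the two-ratio reconstruction error should be small. The match is approximate rather than exact because of window sidelobe leakage, and it degrades if the tone lies very close to 0 or to the Nyquist frequency.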

Bibliographic record

Source: Audio, Speech, and Language Processing, IEEE Transactions on, 2013, No. 7, pp. 1403-1414 (12 pages)
Authors: Zhang S.; Dou W.; Yang H.
Author affiliation: Tsinghua National Laboratory for Information Science and Technology, the Department of Electronic Engineering, Tsinghua University, Beijing, China
Original format: PDF
Language of text: English
Keywords: Discrete Fourier transforms; Educational institutions; Periodic structures; Shape; Signal analysis; Time frequency analysis; Frequency estimation; MDCT; pseudo-magnitude; pseudo-phase; window function
