Journal: IEEE Journal of Selected Topics in Signal Processing

Scalable Multiband Binaural Renderer for MPEG-H 3D Audio


Abstract

To provide immersive 3D multimedia services, MPEG has launched MPEG-H, ISO/IEC 23008, “High Efficiency Coding and Media Delivery in Heterogeneous Environments.” As its audio part, MPEG-H 3D Audio has been standardized based on multichannel loudspeaker configurations (e.g., 22.2). Binaural rendering is a key application of 3D audio; however, previous studies have focused on low-complexity binaural rendering, such as IIR filter design for HRTFs, or on pre-/post-processing to resolve in-head localization or front-back confusion. In this paper, a new binaural rendering algorithm is proposed that supports a large number of input channel signals and provides high quality in terms of timbre; parts of this algorithm were adopted into MPEG-H 3D Audio. The proposed algorithm truncates the binaural room impulse response (BRIR) at the mixing time, the transition point from the early reflections to the late reverberation. The two parts are processed independently, by variable order filtering in the frequency domain (VOFF) and parametric late reverberation filtering (PLF), respectively. Further, a QMF domain tapped delay line (QTDL) is proposed to reduce complexity in the high-frequency band, based on human auditory perception and codec characteristics. A scalability scheme is adopted to cover a wide range of applications by adjusting the mixing-time threshold. Experimental results show that the proposed algorithm is able to match the audio quality of a binaural signal rendered with full-length BRIRs. A scalability test also shows that the proposed scheme trades off smoothly between audio quality and computational complexity.
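
The idea of splitting a BRIR at the mixing time and processing the two parts independently can be illustrated with a minimal Python sketch. It is not the paper's VOFF, PLF, or QTDL processing: it uses plain time-domain convolution, and the impulse response, input signal, and mixing index are hypothetical toy values chosen for illustration. It only shows that, because convolution is linear, the early and late branches can be filtered separately and summed to reproduce the full-length result, which is the starting point from which a cheaper approximation of the late branch could be substituted.

```python
def convolve(x, h):
    """Direct-form linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def split_at_mixing_time(brir, mixing_index):
    """Split an impulse response into an early part and a late part.

    The late part is zero-padded at the front so that it keeps its
    original delay relative to the start of the response.
    """
    early = brir[:mixing_index]
    late = [0.0] * mixing_index + brir[mixing_index:]
    return early, late


# Hypothetical toy data: a decaying 8-tap "BRIR" and a short input signal.
brir = [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125]
signal = [1.0, -1.0, 2.0]
mixing_index = 3  # assumed mixing time, in samples

early, late = split_at_mixing_time(brir, mixing_index)

full = convolve(signal, brir)        # reference: full-length filtering
early_out = convolve(signal, early)  # early-reflection branch
late_out = convolve(signal, late)    # late-reverberation branch

# Pad the early branch to the full output length and sum both branches.
early_out += [0.0] * (len(full) - len(early_out))
recombined = [e + l for e, l in zip(early_out, late_out)]
```

In this toy setting, lowering `mixing_index` shortens the early filter and so reduces the cost of the exact branch, which loosely mirrors how the abstract describes scalability through the mixing-time threshold.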
