Conference: IEEE International Symposium on Signal Processing and Information Technology
A TCN-based Primary Ambient Extraction in Generating Ambisonics Audio from Panorama Video

Abstract

Spatial audio is an essential part of immersive audio-visual experiences such as virtual reality (VR): it reproduces the inherent spatiality of sound and keeps audio and visual cues in correspondence. Ambisonics is the dominant spatial audio solution due to its flexibility and fidelity. However, producing Ambisonics audio is difficult for the general public because it requires expensive equipment or professional music production skills. In this work, an end-to-end Ambisonics generator for panorama video is proposed. To improve the perception of directional sound, we assume that the sound field is composed of a primary sound source and an ambient sound without spatiality, and propose a Temporal Convolutional Network (TCN) based Primary Ambient Extractor (PAE) to separate these two parts of the sound field. The directional sound is spatially encoded using weights from an audio-visual fusion network, and the ambient part is then added. Our network is evaluated on panorama video clips with first-order Ambisonics audio. The results show that the proposed approach outperforms other methods in objective evaluations.
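
The primary/ambient assumption in the abstract can be illustrated with a minimal sketch. This is not the authors' network: the encoding weights here come from a fixed, hand-picked direction instead of an audio-visual fusion network, and the primary and ambient signals are given as inputs instead of being produced by the TCN-based PAE. The function name `encode_foa`, the AmbiX convention (ACN channel order W, Y, Z, X with SN3D normalization), and the choice to add the ambient part to the omnidirectional W channel only are all assumptions made for illustration.

```python
import numpy as np


def encode_foa(primary, ambient, azimuth, elevation):
    """Encode a mono primary signal plus a non-directional ambient signal
    into first-order Ambisonics.

    Assumptions (not taken from the paper):
      - AmbiX convention: channel order W, Y, Z, X with SN3D normalization.
      - Angles are in radians; azimuth is counter-clockwise from the front.
      - The ambient part is added to the omnidirectional W channel only.
    """
    primary = np.asarray(primary, dtype=float)
    ambient = np.asarray(ambient, dtype=float)

    # First-order encoding weights for a single fixed direction.
    weights = np.array([
        1.0,                                   # W: omnidirectional
        np.sin(azimuth) * np.cos(elevation),   # Y: left-right
        np.sin(elevation),                     # Z: up-down
        np.cos(azimuth) * np.cos(elevation),   # X: front-back
    ])

    # Spatially encode the primary signal: shape (4, num_samples).
    foa = weights[:, None] * primary[None, :]

    # Add the ambient part without directional information.
    foa[0] += ambient
    return foa


# Hypothetical example: a source directly to the left, at ear level.
foa = encode_foa([1.0, 0.5], [0.1, 0.1], azimuth=np.pi / 2, elevation=0.0)
```

In this sketch a source directly to the left puts the primary signal into W and Y only, while the ambient signal appears only in W. In the proposed system the per-channel weights would instead be predicted from the video and audio.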