A TCN-based Primary Ambient Extraction in Generating Ambisonics Audio from Panorama Video

Abstract

Spatial audio, which reproduces the inherent spatiality of sound and the correspondence between audio and visual experience, is one of the most essential parts of immersive audio-visual experiences such as virtual reality (VR). Ambisonics is the dominant spatial audio solution due to its flexibility and fidelity. However, producing Ambisonics audio is difficult for the general public because it requires expensive equipment or professional music production skills. In this work, an end-to-end Ambisonics generator for panorama video is proposed. To improve the perception of directional sound, we assume that the sound field is composed of a primary sound source and an ambient sound without spatiality, and we propose a Temporal Convolutional Network (TCN) based Primary Ambient Extractor (PAE) to separate these two parts of the sound field. The directional sound is spatially encoded by the weights from an audio-visual fusion network and then combined with the ambient part. The network is evaluated on panorama video clips with first-order Ambisonics. The results show that the proposed approach outperforms other methods in objective evaluations.
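The primary-plus-ambient assumption in the abstract can be illustrated with a minimal sketch. This is not the paper's network: the abstract says the encoding weights come from an audio-visual fusion network, whereas the sketch below substitutes the standard analytic first-order Ambisonics gains for a known source direction. The function name `encode_foa`, the ACN channel order (W, Y, Z, X) with SN3D normalization, and the choice to add the ambient signal to the omnidirectional W channel only are all assumptions made for illustration.

```python
import math


def encode_foa(primary, ambient, azimuth, elevation):
    """Encode a mono primary signal into first-order Ambisonics.

    Assumptions (not taken from the paper): ACN channel order (W, Y, Z, X),
    SN3D normalization, angles in radians with azimuth measured
    counterclockwise from the front, and a non-directional ambient signal
    that is added to the omnidirectional W channel only.
    """
    if len(primary) != len(ambient):
        raise ValueError("primary and ambient must have the same length")

    # Fixed analytic gains for a plane wave from (azimuth, elevation).
    # In the paper, the weights are instead predicted by a learned network.
    g_w = 1.0
    g_y = math.sin(azimuth) * math.cos(elevation)
    g_z = math.sin(elevation)
    g_x = math.cos(azimuth) * math.cos(elevation)

    # Directional part goes to all four channels; ambient part only to W.
    w = [g_w * p + a for p, a in zip(primary, ambient)]
    y = [g_y * p for p in primary]
    z = [g_z * p for p in primary]
    x = [g_x * p for p in primary]
    return w, y, z, x
```

Under these assumptions, a source directly to the left (azimuth of 90 degrees, elevation of 0) places the primary signal in W and Y, leaves Z and X at approximately zero, and carries the ambient signal in W alone.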

Bibliographic details

Source
IEEE International Symposium on Signal Processing and Information Technology | 2020 | pp. 1-6 | 6 pages
Authors
Zhuliang Lv; Yi Zhou; Hongqing Liu; Xiaofeng Shu; Nannan Zhang
Original format
PDF
Keywords
Training; Measurement; Visualization; Solid modeling; Production; Virtual reality; Generators

