IEEE Transactions on Broadcasting

H2-Stereo: High-Speed, High-Resolution Stereoscopic Video System


Abstract

High-speed, high-resolution stereoscopic (H2-Stereo) video allows us to perceive dynamic 3D content at fine granularity. The acquisition of H2-Stereo video, however, remains challenging with commodity cameras. Existing spatial super-resolution or temporal frame interpolation methods provide compromised solutions that lack temporal or spatial details, respectively. To alleviate this problem, we propose a dual camera system, in which one camera captures high-spatial-resolution low-frame-rate (HSR-LFR) videos with rich spatial details, and the other captures low-spatial-resolution high-frame-rate (LSR-HFR) videos with smooth temporal details. We then devise a Learned Information Fusion network (LIFnet) that exploits the cross-camera redundancies to enhance both camera views to high spatiotemporal resolution (HSTR) for reconstructing the H2-Stereo video effectively. We utilize a disparity network to transfer spatiotemporal information across views even in large disparity scenes, based on which, we propose disparity-guided flow-based warping for LSR-HFR view and complementary warping for HSR-LFR view. A multi-scale fusion method in feature domain is proposed to minimize occlusion-induced warping ghosts and holes in HSR-LFR view. The LIFnet is trained in an end-to-end manner using our collected high-quality Stereo Video dataset from YouTube. Extensive experiments demonstrate that our model outperforms existing state-of-the-art methods for both views on synthetic data and camera-captured real data with large disparity. Ablation studies explore various aspects, including spatiotemporal resolution, camera baseline, camera desynchronization, long/short exposures and applications, of our system to fully understand its capability for potential applications.
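The disparity-guided flow-based warping described in the abstract can be illustrated with a short sketch. The following is not the authors' LIFnet code; it is a minimal PyTorch approximation assuming bilinear grid sampling, where `backward_warp` and `disparity_guided_warp` are hypothetical helper names, the disparity is treated as a purely horizontal displacement, and the two fields are composed by simple addition (a faithful composition would resample one field through the other).

```python
import torch
import torch.nn.functional as F

def backward_warp(src, flow):
    """Bilinearly sample `src` at locations displaced by `flow`.

    src:  (B, C, H, W) image or feature tensor from the reference view
    flow: (B, 2, H, W) per-pixel displacement in pixels (dx, dy)
    """
    b, _, h, w = src.shape
    # Base sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, device=src.device, dtype=src.dtype),
        torch.arange(w, device=src.device, dtype=src.dtype),
        indexing="ij",
    )
    grid_x = xs.unsqueeze(0) + flow[:, 0]  # displaced x coordinates
    grid_y = ys.unsqueeze(0) + flow[:, 1]  # displaced y coordinates
    # Normalize to [-1, 1] as required by grid_sample.
    grid = torch.stack(
        (2.0 * grid_x / (w - 1) - 1.0, 2.0 * grid_y / (h - 1) - 1.0), dim=-1
    )
    return F.grid_sample(src, grid, align_corners=True, padding_mode="border")

def disparity_guided_warp(hsr_frame, disparity, flow):
    """Transfer high-resolution detail across views and across time.

    For a rectified stereo pair the disparity is a purely horizontal
    displacement, so it is expressed as a flow field with zero vertical
    component and composed with the temporal optical flow. The sign of
    the disparity term depends on the left/right view convention.
    Additive composition is a simplification used for illustration.
    """
    disp_flow = torch.cat((-disparity, torch.zeros_like(disparity)), dim=1)
    return backward_warp(hsr_frame, disp_flow + flow)
```

In the full system, high-resolution content warped this way from the HSR-LFR view would then be fused with the upsampled LSR-HFR features at multiple scales in the feature domain, which is where the occlusion-induced ghosts and holes mentioned above are suppressed.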
