
Depth Inference and Visual Saliency Detection from 2D Images.



Abstract

With the rapid development of 3D vision technology, recovering depth information from 2D images has become an active research topic. Current solutions depend heavily on structural assumptions about the 2D image, which limits their applications. It remains technically challenging to develop an efficient yet general solution for generating a depth map from a single image. Furthermore, psychological studies indicate that human eyes are particularly sensitive to salient object regions within an image. It is therefore critical to detect salient objects accurately and segment their boundaries well, since small depth errors in these regions lead to intolerable visual distortion. Briefly speaking, the research in this work falls into two categories: depth map inference system design, and salient object detection and segmentation algorithm development.

For depth map inference system design, we propose a novel depth inference system for 2D images and videos. Specifically, we first adopt in-focus region detection and saliency map computation techniques to separate the foreground objects from the remaining background region. After that, a color-based grab-cut algorithm removes the background from the obtained foreground objects by modeling the background. The depth map of the background is then generated by a modified vanishing-point detection method, and key-frame depth maps are propagated to the remaining frames. Finally, to meet the stringent requirements of VLSI chip implementation, such as limited on-chip memory size and real-time processing, we replace some building modules with simplified versions of the in-focus region detection and the mean-shift algorithm. Experimental results show that the proposed solution provides accurate depth maps for 83% of the test images, while other state-of-the-art methods achieve comparable accuracy on only 34% of them.
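The background-depth step described above can be sketched in a few lines. This is a minimal stand-in, not the dissertation's actual method: it assumes depth simply increases toward a given vanishing point, and the function names (`background_depth_from_vanishing_point`, `fuse_depth`) and the uniform foreground depth are our own illustrative choices.

```python
import numpy as np

def background_depth_from_vanishing_point(h, w, vp, far=255, near=0):
    """Assign each background pixel a depth that grows toward the
    vanishing point `vp` = (vx, vy): a crude stand-in for the modified
    vanishing-point method the abstract refers to."""
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.hypot(xs - vp[0], ys - vp[1])
    dist /= dist.max()  # 0 at the vanishing point, 1 at the farthest pixel
    return (far - dist * (far - near)).astype(np.uint8)

def fuse_depth(bg_depth, fg_mask, fg_depth=30):
    """Paste a uniform 'near' depth over the segmented foreground mask."""
    depth = bg_depth.copy()
    depth[fg_mask] = fg_depth
    return depth
```

In a full pipeline the boolean `fg_mask` would come from the in-focus/saliency detection followed by grab-cut; here it is assumed given.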
This simplified solution targeting VLSI chip implementation has been validated for high accuracy as well as high efficiency on several test video clips.

For salient object detection, inspired by the success of late fusion in semantic analysis and multi-modal biometrics, we model saliency detection as late fusion at the confidence-score level. Specifically, we propose to fuse state-of-the-art saliency models at the score level in a para-boosting learning fashion. First, the saliency maps generated by these models are used as confidence scores. Then, these scores are fed into our para-boosting learner (i.e., a Support Vector Machine (SVM), Adaptive Boosting (AdaBoost), or a Probability Density Estimator (PDE)) to predict the final saliency map. To explore the strength of the para-boosting learners, traditional transformation-based fusion strategies such as Sum, Min, and Max are also applied for comparison. In our application scenario, salient object segmentation is the final goal, so we further propose a novel salient object segmentation scheme using a Conditional Random Field (CRF) graph model. In this segmentation model, we first extract local low-level features, such as the output maps of several saliency models, gradient histograms, and the position of each image pixel. We then train a random forest classifier to fuse the saliency maps into a single high-level feature map using ground-truth annotations. Finally, both low- and high-level features are fed into our CRF and its parameters are learned. The segmentation results are evaluated from two perspectives: region accuracy and contour accuracy. Extensive experimental comparison shows that both our salient object detection and segmentation models outperform state-of-the-art saliency models on ground truth labeled by human subjects and are, so far, the closest to human performance.
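The transformation-based score-level fusion baselines mentioned above (Sum, Min, Max) amount to simple per-pixel operations over a stack of saliency maps. The sketch below is our own minimal illustration, not the dissertation's code; a learned fuser such as an SVM would instead be trained on the per-pixel score vectors that `np.stack` produces.

```python
import numpy as np

def fuse_scores(maps, rule="sum"):
    """Transformation-based score-level fusion of several saliency maps.
    `maps` is a list of HxW arrays already normalized to [0, 1]."""
    stack = np.stack(maps, axis=0)  # shape: (n_models, H, W)
    if rule == "sum":
        fused = stack.mean(axis=0)  # mean keeps the result in [0, 1]
    elif rule == "min":
        fused = stack.min(axis=0)
    elif rule == "max":
        fused = stack.max(axis=0)
    else:
        raise ValueError(f"unknown fusion rule: {rule!r}")
    return fused
```

For example, fusing `[[0.2, 0.8]]` and `[[0.4, 0.6]]` with `rule="max"` yields `[[0.4, 0.8]]`, keeping each pixel's strongest response across models.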

Record details

  • Author

    Wang, Jingwei.

  • Author affiliation

    University of Southern California.

  • Awarding institution University of Southern California.
  • Subject Engineering, Electronics and Electrical.
  • Degree Ph.D.
  • Year 2013
  • Pages 129 p.
  • Total pages 129
  • Format PDF
  • Language eng

  • Date added 2022-08-17 11:40:53
