IEEE Transactions on Circuits and Systems for Video Technology

Transductive Video Segmentation on Tree-Structured Model


Abstract

This paper presents a transductive multicomponent video segmentation algorithm, which is capable of segmenting a predefined object of interest in the frames of a video sequence. To ensure temporal consistency, a temporally coherent parametric min-cut algorithm is developed to generate segmentation hypotheses based on visual cues and motion cues. Each hypothesis is then evaluated by an energy function combining foreground resemblance, foreground/background divergence, boundary strength, and visual saliency. In particular, the state-of-the-art R-convolutional neural network descriptor is leveraged to encode the visual appearance of the foreground object. Finally, the optimal segmentation of each frame is attained by assembling the segmentation hypotheses through a Monte Carlo approximation. Moreover, multiple foreground components are built to capture the variation of the foreground object in shape and pose. To group the frames into different components, a tree-structured graphical model named the temporal tree is designed, in which visually similar and temporally coherent frames are arranged in branches. The temporal tree is constructed by iteratively adding frames to the active nodes via probabilistic clustering. In addition, each component, consisting of the frames in the same branch, is characterized by a support vector machine classifier, which is learned in a transductive fashion by jointly maximizing the margin over the labeled and the unlabeled frames. Since frames from the same video sequence follow the same distribution, the transductive classifiers achieve stronger generalization than inductive ones. Experimental results on public benchmarks demonstrate the effectiveness of the proposed method in comparison with other state-of-the-art supervised and unsupervised video segmentation methods.
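The hypothesis-assembly step described in the abstract can be illustrated with a minimal sketch: each segmentation hypothesis (a binary mask) is weighted by its energy, and the weighted average of the masks is thresholded to obtain the final segmentation. The function name, the exponential (softmax-style) weighting, and the 0.5 threshold below are illustrative assumptions for a Monte Carlo-style combination, not the authors' implementation.

```python
import numpy as np

def assemble_hypotheses(masks, energies, beta=1.0):
    """Combine binary segmentation hypotheses into a single mask.

    masks    : list of (H, W) binary arrays, one per hypothesis.
    energies : list of floats from the evaluation step
               (lower energy = better hypothesis).
    beta     : temperature controlling how sharply low-energy
               hypotheses dominate the average.
    """
    stack = np.stack(masks).astype(float)            # (K, H, W)
    w = np.exp(-beta * np.asarray(energies, float))  # energy -> weight
    w /= w.sum()                                     # normalize weights
    soft = np.tensordot(w, stack, axes=1)            # (H, W) soft mask
    return (soft >= 0.5).astype(np.uint8)            # final binary mask

# Toy usage: a low-energy hypothesis dominates a high-energy one.
m1 = np.array([[1, 1], [0, 0]], dtype=np.uint8)
m2 = np.array([[1, 0], [0, 0]], dtype=np.uint8)
result = assemble_hypotheses([m1, m2], energies=[0.0, 10.0])
```

With energies 0.0 and 10.0, almost all of the weight falls on the first hypothesis, so the assembled mask coincides with `m1`.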
