IEEE Transactions on Pattern Analysis and Machine Intelligence

SPFTN: A Joint Learning Framework for Localizing and Segmenting Objects in Weakly Labeled Videos



Abstract

Object localization and segmentation in weakly labeled videos are two interesting yet challenging tasks. Models for simultaneous object localization and segmentation have been explored in the conventional fully supervised learning scenario to boost the performance of each task. However, none of the existing works has attempted to jointly learn object localization and segmentation models under weak supervision. To this end, we propose a joint learning framework called Self-Paced Fine-Tuning Network (SPFTN) for localizing and segmenting objects in weakly labeled videos. Jointly learning a deep model for object localization and segmentation under weak supervision is very challenging, as the learning process of each single task faces a serious ambiguity issue due to the lack of bounding-box or pixel-level supervision. To address this problem, the proposed deep SPFTN model is carefully designed with a novel multi-task self-paced learning objective, which leverages task-specific prior knowledge and the knowledge already captured by the model to infer confident training samples for each task. By aggregating the confident knowledge from each single task to mine reliable patterns, and by learning deep feature representations for both tasks, the proposed learning framework addresses the ambiguity issue under weak supervision with simple optimization. Comprehensive experiments on the large-scale YouTube-Objects and DAVIS datasets demonstrate that the proposed approach achieves superior performance compared with other state-of-the-art methods and baseline networks/models.
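The abstract does not give the multi-task self-paced objective in closed form. As a rough illustration of the underlying idea only, the sketch below implements a hard (binary) self-paced regularizer that keeps, for each task, the samples whose current loss falls below an "age" threshold; the function names, the PyTorch dependency, and the per-task thresholds `lam_loc`/`lam_seg` are assumptions for illustration, not the paper's exact formulation (which also incorporates task-specific prior knowledge).

```python
import torch


def self_paced_weights(losses: torch.Tensor, lam: float) -> torch.Tensor:
    """Hard self-paced regularizer: a sample is selected (weight 1) only when its
    current loss is below the age parameter `lam`; otherwise it is dropped (0).
    Growing `lam` over training gradually admits harder samples."""
    return (losses.detach() < lam).float()


def multi_task_spl_loss(loc_losses: torch.Tensor, seg_losses: torch.Tensor,
                        lam_loc: float, lam_seg: float) -> torch.Tensor:
    """Illustrative multi-task self-paced objective: each task selects its own
    confident samples, and only the selected samples contribute to the joint loss."""
    v_loc = self_paced_weights(loc_losses, lam_loc)   # confident samples for localization
    v_seg = self_paced_weights(seg_losses, lam_seg)   # confident samples for segmentation
    joint = (v_loc * loc_losses).sum() + (v_seg * seg_losses).sum()
    # Normalize by the number of selected samples to keep the scale stable.
    return joint / (v_loc.sum() + v_seg.sum()).clamp(min=1.0)
```

In such a scheme, training alternates between recomputing the selection weights with the current model and fine-tuning the shared network on the selected samples of both tasks, with the thresholds increased each round so that more training data is admitted as the model matures.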


