IEEE Transactions on Image Processing

Learning Event Representations for Temporal Segmentation of Image Sequences by Dynamic Graph Embedding



Abstract

Recently, self-supervised learning has proved to be effective to learn representations of events suitable for temporal segmentation in image sequences, where events are understood as sets of temporally adjacent images that are semantically perceived as a whole. However, although this approach does not require expensive manual annotations, it is data hungry and suffers from domain adaptation problems. As an alternative, in this work, we propose a novel approach for learning event representations named Dynamic Graph Embedding (DGE). The assumption underlying our model is that a sequence of images can be represented by a graph that encodes both semantic and temporal similarity. The key novelty of DGE is to learn jointly the graph and its graph embedding. At its core, DGE works by iterating over two steps: 1) updating the graph representing the semantic and temporal similarity of the data based on the current data representation, and 2) updating the data representation to take into account the current data graph structure. The main advantage of DGE over state-of-the-art self-supervised approaches is that it does not require any training set, but instead learns iteratively from the data itself a low-dimensional embedding that reflects their temporal and semantic similarity. Experimental results on two benchmark datasets of real image sequences captured at regular time intervals demonstrate that the proposed DGE leads to event representations effective for temporal segmentation. In particular, it achieves robust temporal segmentation on the EDUBSeg and EDUBSeg-Desc benchmark datasets, outperforming the state of the art. Additional experiments on two Human Motion Segmentation benchmark datasets demonstrate the generalization capabilities of the proposed DGE.
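The alternating two-step scheme described above can be illustrated with a minimal sketch. The snippet below is an assumption-laden toy version, not the authors' implementation: the graph step is assumed to be a cosine k-NN adjacency merged with a fixed temporal window, and the embedding step is assumed to be a Laplacian-eigenmap-style update; the function names (build_graph, embed, dge_sketch) and all parameters are hypothetical.

```python
# Illustrative sketch of the alternating scheme in the abstract (not the paper's code).
# Assumptions: cosine k-NN + temporal-window graph, Laplacian-eigenmap-style embedding.
import numpy as np

def build_graph(X, k=10, temporal_window=2):
    """Adjacency combining semantic (k-NN on the current embedding) and temporal links."""
    n = X.shape[0]
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    S = Xn @ Xn.T                                   # cosine similarity between frames
    A = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(-S[i])[1:k + 1]           # semantic neighbours (skip self)
        A[i, nbrs] = S[i, nbrs]
        lo, hi = max(0, i - temporal_window), min(n, i + temporal_window + 1)
        A[i, lo:hi] = np.maximum(A[i, lo:hi], 1.0)  # temporally adjacent frames
    return np.maximum(A, A.T)                       # symmetrise

def embed(A, dim=16):
    """Embedding from the graph: bottom eigenvectors of the normalised Laplacian."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-12))
    L = np.eye(A.shape[0]) - D_inv_sqrt @ A @ D_inv_sqrt
    _, vecs = np.linalg.eigh(L)
    return vecs[:, 1:dim + 1]                       # drop the trivial eigenvector

def dge_sketch(features, n_iter=10):
    """Alternate the two steps: update the graph, then update the representation."""
    X = features.copy()
    for _ in range(n_iter):
        A = build_graph(X)   # step 1: graph from the current representation
        X = embed(A)         # step 2: representation from the current graph
    return X, A
```

Temporal segmentation boundaries could then be obtained from the final embedding, for example by change-point detection or clustering along the time axis.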


