Discriminative Unsupervised Alignment of Natural Language Instructions with Corresponding Video Segments

Abstract

We address the problem of automatically aligning natural language sentences with corresponding video segments without any direct supervision. Most existing algorithms for integrating language with videos rely on hand-aligned parallel data, where each natural language sentence is manually aligned with its corresponding image or video segment. Recently, fully unsupervised alignment of text with video has been shown to be feasible using hierarchical generative models. In contrast to the previous generative models, we propose three latent-variable discriminative models for the unsupervised alignment task. The proposed discriminative models are capable of incorporating domain knowledge, by adding diverse and overlapping features. The results show that discriminative models outperform the generative models in terms of alignment accuracy.
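
The abstract gives no model details, so the following is only a minimal sketch of the general idea it names: scoring a hidden (latent) alignment with a weighted sum of overlapping features and searching for the best alignment. It is not the authors' model. The monotonicity constraint, the two features, the hand-set weights, and the toy data (object labels per video segment) are all assumptions made for illustration; in a real discriminative model the weights would be learned rather than fixed.

```python
from typing import Callable, Dict, List, Set

NEG_INF = float("-inf")


def align_monotonic(n_sent: int, n_seg: int,
                    score: Callable[[int, int], float]) -> List[int]:
    """Best monotonic alignment of segments to sentences by dynamic programming.

    Returns, for each segment j, the index of its sentence. Assumed constraints:
    segment 0 goes to sentence 0, the last segment goes to the last sentence,
    and the sentence index either stays or advances by one between segments.
    """
    if n_sent == 0 or n_seg < n_sent:
        raise ValueError("need at least one sentence and n_seg >= n_sent")
    best = [[NEG_INF] * n_sent for _ in range(n_seg)]
    back = [[0] * n_sent for _ in range(n_seg)]
    best[0][0] = score(0, 0)
    for j in range(1, n_seg):
        for i in range(min(j + 1, n_sent)):
            stay = best[j - 1][i]
            advance = best[j - 1][i - 1] if i > 0 else NEG_INF
            prev_best, prev = (advance, i - 1) if advance > stay else (stay, i)
            if prev_best == NEG_INF:
                continue  # cell not reachable from the start
            best[j][i] = prev_best + score(i, j)
            back[j][i] = prev
    # Trace back from the last segment, which must align to the last sentence.
    alignment = [0] * n_seg
    i = n_sent - 1
    for j in range(n_seg - 1, -1, -1):
        alignment[j] = i
        i = back[j][i]
    return alignment


# Hypothetical toy data: tokenized instructions and object labels per segment.
sentences = [["add", "water"], ["shake", "flask"], ["pour", "beaker"]]
segments: List[Set[str]] = [{"water"}, {"water"}, {"flask"},
                            {"beaker"}, {"beaker"}]

# Hand-set weights (an assumption); a trained model would learn these.
WEIGHTS: Dict[str, float] = {"overlap": 1.0, "position": 0.5}


def features(i: int, j: int) -> Dict[str, float]:
    """Two overlapping features for pairing sentence i with segment j."""
    return {
        # How many words of the sentence name an object seen in the segment.
        "overlap": float(sum(1 for w in sentences[i] if w in segments[j])),
        # Penalty for pairing items at different relative positions.
        "position": -abs(i / len(sentences) - j / len(segments)),
    }


def linear_score(i: int, j: int) -> float:
    """Weighted sum of the feature values."""
    return sum(WEIGHTS[name] * value for name, value in features(i, j).items())


alignment = align_monotonic(len(sentences), len(segments), linear_score)
print(alignment)  # [0, 0, 1, 2, 2]
```

Because the score is a plain weighted sum, further features (for example, domain knowledge about which verbs co-occur with which objects) could be added to the dictionary without changing the search.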

Bibliographic details

Source
Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2015, pp. 164-174 (11 pages)
Authors
Iftekhar Naim; Young Chol Song; Qiguang Liu; Liang Huang; Henry Kautz; Jiebo Luo; Daniel Gildea
Original format
PDF

Similar literature

1. VideoWhisper: Toward Discriminative Unsupervised Video Feature Learning With Attention-Based Recurrent Neural Networks [J]. Na Zhao, Hanwang Zhang, Richang Hong. IEEE Transactions on Multimedia, 2017, no. 9
2. Kernel alignment unsupervised discriminative dimensionality reduction [J]. Gao Yunlong, Luo Sizhe, Pan Jinyan. Neurocomputing, 2021, no. Sepa17
3. Discriminative Subspace Alignment for Unsupervised Visual Domain Adaptation [J]. Sun Hao, Liu Shuai, Zhou Shilin. Neural Processing Letters, 2016, no. 3
4. Discriminative Unsupervised Alignment of Natural Language Instructions with Corresponding Video Segments [C]. Iftekhar Naim, Young Chol Song, Qiguang Liu. Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2015
5. Unsupervised Alignment of Natural Language with Video [D]. Naim, Iftekhar. 2015
6. Unsupervised learning of natural languages [O]. Zach Solan, David Horn, Eytan Ruppin. 2005
7. Discriminative Unsupervised Alignment of Natural Language Instructions with Corresponding Video Segments [O]. Iftekhar Naim, Young Chol Song, Qiguang Liu. 2016
8. Instructional Videos for Unsupervised Harvesting and Learning of Action Examples [R]. Yu, S., Jiang, L., Hauptmann, A. 2014
