IEEE/CVF Conference on Computer Vision and Pattern Recognition

Unsupervised Textual Grounding: Linking Words to Image Concepts



Abstract

Textual grounding, i.e., linking words to objects in images, is a challenging but important task for robotics and human-computer interaction. Existing techniques benefit from recent progress in deep learning and generally formulate the task as a supervised learning problem, selecting a bounding box from a set of possible options. Training these deep-net-based approaches requires access to large-scale datasets; however, constructing such a dataset is time-consuming and expensive. We therefore develop a completely unsupervised mechanism for textual grounding that uses hypothesis testing to link words to detected image concepts. We demonstrate our approach on the ReferIt Game dataset and the Flickr30k data, outperforming baselines by 7.98% and 6.96% respectively.
