Expressing Visual Relationships via Language

Abstract

Describing images with text is a fundamental problem in vision-language research. Current studies in this domain mostly focus on single-image captioning. However, in various real applications (e.g., image editing, difference interpretation, and retrieval), generating relational captions for two images can also be very useful. This important problem has remained largely unexplored, mostly due to a lack of datasets and effective models. To push forward the research in this direction, we first introduce a new language-guided image editing dataset that contains a large number of real image pairs with corresponding editing instructions. We then propose a new relational speaker model based on an encoder-decoder architecture with static relational attention and sequential multi-head attention. We also extend the model with dynamic relational attention, which calculates visual alignment while decoding. Our models are evaluated on our newly collected dataset and on two public datasets consisting of image pairs annotated with relationship sentences. Experimental results, based on both automatic and human evaluation, demonstrate that our model outperforms all baselines and existing methods on all the datasets.
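The abstract names relational attention between two images but does not give its formulation. As a rough illustration only, the sketch below assumes plain scaled dot-product attention in which each region of a source image attends over the regions of a target image. The function name `cross_image_attention`, the toy feature vectors, and the choice of dot-product scoring are all assumptions made for this example; they are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross_image_attention(src_feats, trg_feats):
    """Hypothetical helper: attend from source-image regions to target-image regions.

    src_feats, trg_feats: lists of feature vectors (one per image region),
    all of the same dimensionality.
    Returns (contexts, weights), where weights[i] is the alignment
    distribution of source region i over the target regions and contexts[i]
    is the corresponding weighted sum of target features.
    """
    dim = len(src_feats[0])
    scale = math.sqrt(dim)  # assumed scaled dot-product scoring
    contexts, weights = [], []
    for query in src_feats:
        scores = [dot(query, key) / scale for key in trg_feats]
        w = softmax(scores)
        ctx = [
            sum(w[j] * trg_feats[j][d] for j in range(len(trg_feats)))
            for d in range(dim)
        ]
        contexts.append(ctx)
        weights.append(w)
    return contexts, weights


# Toy region features for two images (made-up values, illustration only).
src = [[1.0, 0.0], [0.0, 1.0]]
trg = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contexts, weights = cross_image_attention(src, trg)
```

Each row of `weights` is a probability distribution over the target regions, so it can be read as a soft alignment between the two images. Computing such alignments once before decoding would loosely correspond to a "static" variant, while recomputing them at every decoding step would loosely correspond to a "dynamic" one; that mapping is an interpretation of the abstract's wording, not a confirmed detail.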

Bibliographic record

Source
Annual Meeting of the Association for Computational Linguistics | 2019 | pp. 1873-1883 | 11 pages
Authors
Hao Tan; Franck Dernoncourt; Zhe Lin; Trung Bui; Mohit Bansal
Original format: PDF
Date indexed: 2022-08-26 13:54:09
