Conference: IEEE International Conference on Computer Vision

Visual Madlibs: Fill in the Blank Description Generation and Question Answering

Abstract

In this paper, we introduce a new dataset consisting of 360,001 focused natural language descriptions for 10,738 images. This dataset, the Visual Madlibs dataset, is collected using automatically produced fill-in-the-blank templates designed to gather targeted descriptions about: people and objects, their appearances, activities, and interactions, as well as inferences about the general scene or its broader context. We provide several analyses of the Visual Madlibs dataset and demonstrate its applicability to two new description generation tasks: focused description generation, and multiple-choice question-answering for images. Experiments using joint-embedding and deep learning methods show promising results on these tasks.
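To make the multiple-choice task concrete, the following is a minimal sketch of how a joint-embedding approach could pick an answer: the image and each candidate answer are assumed to already be mapped into one shared vector space, and the candidate with the highest cosine similarity to the image is selected. The vectors, the candidate phrases, and the function names `cosine` and `pick_answer` are hypothetical and chosen for illustration only; the abstract does not specify the models or scoring functions actually used.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_u * norm_v)


def pick_answer(image_vec, candidate_vecs):
    """Return (index of best candidate, list of all similarity scores)."""
    scores = [cosine(image_vec, c) for c in candidate_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy embeddings (assumed, not taken from the dataset). In practice these
# would come from an image encoder and a text encoder trained to share a space.
image_vec = [0.9, 0.1, 0.0]
candidates = ["playing frisbee", "eating dinner", "reading a book"]
candidate_vecs = [
    [0.8, 0.2, 0.1],
    [0.0, 0.1, 0.9],
    [0.1, 0.9, 0.2],
]

best, scores = pick_answer(image_vec, candidate_vecs)
print("The person is", candidates[best])
```

With these toy vectors the first candidate lies closest to the image vector, so it fills the blank. A real system would differ mainly in how the embeddings are produced, not in this final selection step.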
