Conference: Workshop on Beyond Vision and LANguage: inTEgrating Real-world kNowledge

What Did This Castle Look Like Before? Exploring Referential Relations in Naturally Occurring Multimodal Texts


Abstract

Multi-modal texts are abundant and diverse in structure, yet Vision & Language research of these naturally occurring texts has mostly focused on genres that are comparatively light on text, like tweets. In this paper, we discuss the challenges and potential benefits of a V&L framework that explicitly models referential relations, taking Wikipedia articles about buildings as an example. We briefly survey existing related tasks in V&L and propose multi-modal information extraction as a general direction for future research.
