Conference on Empirical Methods in Natural Language Processing

End-to-end Image Captioning Exploits Distributional Similarity in Multimodal Space

Abstract

Image description generation, or image captioning (IC), is the task of automatically generating a textual description for a given image. The generated text is expected to describe, generally in a single sentence, what is visually depicted in the image, for example the entities/objects present in the image, their attributes, the actions/activities performed, entity/object interactions (including quantification), the location/scene, etc. (e.g. "a man riding a bike on the street"). Significant progress has been made with end-to-end approaches to tackling this problem, where parallel image-description datasets such as Flickr30k (Young et al., 2014) and MSCOCO (Chen et al., 2015) are used to train a CNN-RNN based neural network IC system (Vinyals et al., 2017; Karpathy and Fei-Fei, 2015; Xu et al., 2015). Such systems have demonstrated impressive performance in the COCO captioning challenge according to automatic metrics, seemingly even surpassing human performance in many instances (e.g. CIDEr score > 1.0 vs. human's 0.85) (Chen et al., 2015). However, in reality, the performance of end-to-end systems is still far from satisfactory according to metrics based on human judgement. This task is thus currently far from being a solved problem.
