IEEE Conference on Computer Vision and Pattern Recognition

Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data

Abstract

While recent deep neural network models have achieved promising results on the image captioning task, they rely largely on the availability of corpora with paired image and sentence captions to describe objects in context. In this work, we propose the Deep Compositional Captioner (DCC) to address the task of generating descriptions of novel objects which are not present in paired image-sentence datasets. Our method achieves this by leveraging large object recognition datasets and external text corpora and by transferring knowledge between semantically similar concepts. Current deep caption models can only describe objects contained in paired image-sentence corpora, despite the fact that they are pre-trained with large object recognition datasets, namely ImageNet. In contrast, our model can compose sentences that describe novel objects and their interactions with other objects. We demonstrate our model's ability to describe novel concepts by empirically evaluating its performance on MSCOCO and show qualitative results on ImageNet images of objects for which no paired image-sentence data exist. Further, we extend our approach to generate descriptions of objects in video clips. Our results show that DCC has distinct advantages over existing image and video captioning approaches for generating descriptions of new objects in context.
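The abstract leaves the transfer mechanism unspecified; one plausible reading of "transferring knowledge between semantically similar concepts" is that a novel word borrows the caption model's output-layer parameters from its closest in-vocabulary word under word-embedding similarity. The Python sketch below illustrates that idea only; the function name, array shapes, and toy vocabulary are hypothetical and not taken from the paper.

# A minimal sketch (not the paper's implementation): transfer the
# caption model's output-layer weights for a novel word from its
# closest known word under cosine similarity of word embeddings.
import numpy as np

def transfer_novel_word(W_out, vocab, embeddings, novel_word, known_words):
    """W_out: (vocab_size, hidden) output weights of a caption model.
    vocab: word -> row index. embeddings: word -> 1-D vector (e.g. word2vec).
    Copies the row of the most similar known word into the novel word's row
    and returns the word that was used as the transfer source."""
    v = embeddings[novel_word]
    v = v / np.linalg.norm(v)  # normalize once; cosine = u @ v / ||u||
    sims = {w: float(embeddings[w] @ v) / np.linalg.norm(embeddings[w])
            for w in known_words}
    source = max(sims, key=sims.get)  # semantically closest known word
    W_out[vocab[novel_word]] = W_out[vocab[source]]
    return source

# Toy usage with made-up data: "okapi" borrows weights from "zebra".
rng = np.random.default_rng(0)
vocab = {"zebra": 0, "horse": 1, "okapi": 2}
emb = {"zebra": np.array([1.0, 0.1]), "horse": np.array([0.2, 1.0]),
       "okapi": np.array([0.9, 0.2])}
W_out = rng.normal(size=(3, 4))
print(transfer_novel_word(W_out, vocab, emb, "okapi", ["zebra", "horse"]))

In practice the embedding space would be learned from the external text corpora the abstract mentions, so that words unseen in paired image-sentence data still have meaningful neighbors among words the caption model was trained on.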