Annual Conference on Neural Information Processing Systems

Expressing an Image Stream with a Sequence of Natural Sentences

Abstract

We propose an approach for retrieving a sequence of natural sentences for an image stream. Since general users often take a series of pictures of their special moments, it is better to take the whole image stream into consideration when producing natural language descriptions. While almost all previous studies have dealt with the relation between a single image and a single natural sentence, our work extends both the input and output dimensions to a sequence of images and a sequence of sentences. To this end, we design a multimodal architecture called the coherence recurrent convolutional network (CRCN), which consists of convolutional neural networks, bidirectional recurrent neural networks, and an entity-based local coherence model. Our approach learns directly from the vast user-generated resource of blog posts as text-image parallel training data. We demonstrate that our approach outperforms other state-of-the-art candidate methods, using both quantitative measures (e.g., BLEU and top-K recall) and user studies via Amazon Mechanical Turk.
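
The abstract names top-K recall as one of its quantitative measures without giving the evaluation protocol. As a rough illustration only, the generic form of this retrieval metric can be sketched as follows. The function name `recall_at_k`, the integer candidate IDs, and the toy rankings are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def recall_at_k(
    ranked_ids: Sequence[Sequence[int]],
    truth_ids: Sequence[int],
    k: int,
) -> float:
    """Fraction of queries whose ground-truth item is among the top-k candidates.

    ranked_ids[i] is the candidate ranking for query i, best match first.
    truth_ids[i] is the ID of the single correct candidate for query i.
    This is the generic retrieval metric, assumed here for illustration.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if len(ranked_ids) != len(truth_ids):
        raise ValueError("need exactly one ground-truth ID per query")
    if not truth_ids:
        return 0.0
    # Count the queries whose correct item survives the top-k cut.
    hits = sum(
        1
        for ranking, truth in zip(ranked_ids, truth_ids)
        if truth in ranking[:k]
    )
    return hits / len(truth_ids)


# Toy data (hypothetical): three queries, each ranking three candidates.
rankings = [[3, 1, 2], [0, 2, 1], [2, 0, 1]]
truth = [1, 0, 1]

for k in (1, 2, 3):
    print(f"recall@{k} = {recall_at_k(rankings, truth, k):.3f}")
```

In a sentence-sequence retrieval setting, each query would presumably be an image stream and each candidate a sentence sequence, but that mapping is an assumption rather than something stated in the abstract.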
