Journal: Neurocomputing

Phrase-based image caption generator with hierarchical LSTM network

Abstract

Automatic generation of captions that describe the content of an image has attracted considerable research interest recently, and most existing works treat the image caption as purely sequential data. Natural language, however, possesses a temporal hierarchical structure, with complex dependencies between subsequences. In this paper, we propose a phrase-based image captioning model that uses a hierarchical Long Short-Term Memory (phi-LSTM) architecture to generate image descriptions. In contrast to conventional solutions that generate a caption in a purely sequential manner, phi-LSTM decodes the image caption from phrase to sentence. It consists of a phrase decoder, which decodes noun phrases of variable length, and an abbreviated sentence decoder, which decodes the abbreviated form of the image description. A complete image caption is formed by combining the generated phrases with the abbreviated sentence during the inference stage. Empirically, our proposed model shows better or competitive results on the Flickr8k, Flickr30k and MS-COCO datasets in comparison to state-of-the-art models. We also show that our proposed model is able to generate more novel captions (not seen in the training data), which are richer in word content, on all three datasets. (C) 2018 Elsevier B.V. All rights reserved.
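
The final combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how phrase positions are marked in the abbreviated sentence, so the `<NP>` placeholder token and the `combine` helper below are hypothetical, and the two decoders are replaced by hand-written outputs.

```python
from typing import List

# Hypothetical slot marker; the abstract does not specify how the
# abbreviated sentence records where a noun phrase belongs.
PHRASE_SLOT = "<NP>"


def combine(abbreviated: List[str], phrases: List[List[str]]) -> str:
    """Fill each phrase slot in the abbreviated sentence with the next phrase, in order."""
    remaining = iter(phrases)
    words: List[str] = []
    for token in abbreviated:
        if token == PHRASE_SLOT:
            phrase = next(remaining, None)
            if phrase is None:
                raise ValueError("fewer phrases than slots")
            words.extend(phrase)
        else:
            words.append(token)
    if next(remaining, None) is not None:
        raise ValueError("more phrases than slots")
    return " ".join(words)


# Stand-ins for decoder outputs (assumed, for illustration only).
abbreviated_sentence = ["<NP>", "is", "sitting", "on", "<NP>"]
noun_phrases = [["a", "brown", "dog"], ["the", "green", "grass"]]

caption = combine(abbreviated_sentence, noun_phrases)
print(caption)  # a brown dog is sitting on the green grass
```

In this sketch the phrases are matched to slots purely by order, which is the simplest assumption available; the actual model may associate phrases with positions differently.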
