Pattern Recognition Letters

How much do cross-modal related semantics benefit image captioning by weighting attributes and re-ranking sentences?



Abstract

Although image description generation from attributes has achieved significant progress, two main problems remain to be solved: (1) How can more accurate attributes be obtained? (2) How can the gap between sentence generation and evaluation be mitigated? To address these issues, we propose a new method that incorporates cross-modal related semantics into the encoder-decoder structure for image captioning. In the encoding stage, we use salient words derived from cross-modal retrieval to improve the accuracy of attributes. In the decoding stage, we explore two ways to re-rank the sentences generated through beam search, guided by semantics acquired through a modified cross-modal retrieval method. Evaluation results on the benchmark dataset MS-COCO, in both offline and online settings, demonstrate the benefits of cross-modal related semantics for image captioning. (C) 2019 Elsevier B.V. All rights reserved.
