Journal: IEEE Transactions on Multimedia

COMIC: Toward A Compact Image Captioning Model With Attention



Abstract

Recent work in image captioning has shown very promising raw performance. However, most of these encoder-decoder networks with attention do not scale naturally to large vocabulary sizes, which makes them difficult to deploy on embedded systems with limited hardware resources. This is because the sizes of the word embedding and output embedding matrices grow proportionally with the vocabulary size, adversely affecting the compactness of these networks. To address this limitation, we tackle the compactness of image captioning models, a problem that has hitherto been unexplored. We show that our proposed model, named COMIC (for compact image captioning), achieves results comparable to state-of-the-art approaches on five common evaluation metrics on both the MS-COCO and InstaPIC-1.1M datasets, despite having an embedded vocabulary size that is 39x-99x smaller.
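To make the scaling argument in the abstract concrete, the following minimal sketch counts the parameters in the two vocabulary-dependent matrices of a generic caption decoder. It is not the COMIC architecture: the function name `embedding_params`, the dimensions (512), and the vocabulary sizes are illustrative assumptions that do not appear in the abstract.

```python
def embedding_params(vocab_size: int, embed_dim: int, hidden_dim: int) -> dict:
    """Count parameters in the vocabulary-dependent layers of a generic decoder.

    Assumes a standard layout (hypothetical, not taken from the paper):
    an input word embedding of shape V x E and an output projection of
    shape H x V with a bias vector of length V.
    """
    word_embedding = vocab_size * embed_dim
    output_projection = hidden_dim * vocab_size + vocab_size
    return {
        "word_embedding": word_embedding,
        "output_projection": output_projection,
        "total": word_embedding + output_projection,
    }


# Illustrative sizes only; none of these numbers come from the abstract.
EMBED_DIM = 512
HIDDEN_DIM = 512

for vocab_size in (100, 1_000, 10_000):
    counts = embedding_params(vocab_size, EMBED_DIM, HIDDEN_DIM)
    print(f"V={vocab_size:>6}: {counts['total']:>12,} vocabulary-dependent parameters")
```

Under these assumptions both matrices grow linearly in the vocabulary size, so shrinking the embedded vocabulary reduces this part of the model by roughly the same factor.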
