Multimodal Image Captioning Through Combining Reinforced Cross Entropy Loss and Stochastic Deprecation


Abstract

Recently, Cross Entropy Loss (CEL) has proved useful in encoder-decoder based multimodal image captioning; however, it still suffers from an inconsistency between the optimization objective and the evaluation metrics. In this paper, we propose a new approach for multimodal image captioning. It consists of 1) Reinforced Cross Entropy Loss (RCEL), which maximizes the probability of ground truth captions and optimizes the evaluation metrics directly, and 2) Stochastic Deprecation (SD), which automatically selects high-quality ground truth sentences without losing the diversity of the corpus. The proposed RCEL and SD are generic and can improve existing natural language generation models, while combining them (RCEL-SD) achieves the best result. Experimental results on the benchmark MSCOCO dataset show that the proposed RCEL-SD outperforms CEL on all 7 evaluation metrics for each of three recent image captioning models.
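
The abstract gives no formulas, so the following Python sketch is only an illustration of the two ideas, not the authors' formulation. It assumes that RCEL can be approximated by scaling the cross-entropy of a ground-truth caption by a metric-based reward in [0, 1], and that SD can be approximated by keeping each ground-truth caption with a probability equal to a quality score. The function names, the reward, and the quality scores are all hypothetical.

```python
import math
import random


def cross_entropy(token_probs):
    """Plain CEL: mean negative log-probability the model assigns
    to the tokens of one ground-truth caption."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def reward_weighted_cross_entropy(token_probs, reward):
    """ASSUMPTION (not from the paper): approximate a 'reinforced' loss by
    scaling CEL with a reward in [0, 1] derived from an evaluation metric."""
    return reward * cross_entropy(token_probs)


def stochastic_keep(captions, quality, rng):
    """ASSUMPTION (not from the paper): keep each ground-truth caption with
    probability equal to its quality score, so low-quality captions are
    usually dropped but never excluded with certainty."""
    return [c for c, q in zip(captions, quality) if rng.random() < q]


if __name__ == "__main__":
    probs = [0.5, 0.5]  # hypothetical per-token model probabilities
    print(cross_entropy(probs))
    print(reward_weighted_cross_entropy(probs, reward=0.5))
    print(stochastic_keep(["a dog runs", "dog"], [0.9, 0.2], random.Random(0)))
```

Under these assumptions, a caption with a higher metric score contributes more to the loss, which is one simple way to tie the training objective to the evaluation metric. The sampling step keeps some randomness so that lower-scored captions can still appear in training.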


Bibliographic record

Source
IEEE International Conference on Multimedia and Expo | 2019 | pp. 1318-1323 | 6 pages
Authors
Xi Meng; Hao Kong; Dongqi Tang; Tong Lu
Original format: PDF
Keywords
Measurement; Entropy; Stochastic processes; Training; Sports; Decoding; Natural languages


Similar literature


1. Truncation Cross Entropy Loss for Remote Sensing Image Captioning [J]. Li Xuelong, Zhang Xueting, Huang Wei. IEEE Transactions on Geoscience and Remote Sensing, 2021, No. 6
2. From Deterministic to Generative: Multimodal Stochastic RNNs for Video Captioning [J]. Song Jingkuan, Guo Yuyu, Gao Lianli. IEEE Transactions on Neural Networks and Learning Systems, 2019, No. 10
3. Retinal vessel segmentation of color fundus images using multiscale convolutional neural network with an improved cross-entropy loss function [J]. Hu Kai, Zhang Zhenzhen, Niu Xiaorui. Neurocomputing, 2018, No. OCTa2
4. Multimodal Image Captioning Through Combining Reinforced Cross Entropy Loss and Stochastic Deprecation [C]. Xi Meng, Hao Kong, Dongqi Tang. IEEE International Conference on Multimedia and Expo, 2019
5. Stochastic J-integral and reliability of composite laminates based on a computational methodology combining experimental investigation, stochastic finite element analysis and maximum entropy method [D]. Pondugala, Lakshmi Vara Prasad. 2000
6. Spiking Cortical Model Based Multimodal Medical Image Fusion by Combining Entropy Information with Weber Local Descriptor [O]. Xuming Zhang, Jinxia Ren, Zhiwen Huang. 2016
7. Rapid Fire Abstract: Multimodality imaging valvular heart disease: 742 Quantification of aortic regurgitation by pulsed Doppler examination of the left subclavian artery velocity contour: a validation study with cardiac magnetic resonance imaging; 743 Diastolic retrograde flow in the descending aorta by cardiovascular magnetic resonance imaging for the quantification of aortic regurgitation; 744 Native T1 relaxation time can accurately identify limited left ventricular contractile reserve in patients with aortic stenosis; 745 The validation and assessment of myocardial fibrosis by using cardiac magnetic resonance and speckle-tracking echocardiography in severe aortic stenosis; 746 Clinical validation of a semi-automatic quantification score of aortic valve calcification with ultrasound; 747 A comparison among conventional 3D-transesophageal echocardiography manual analysis, 3D automatic software analysis and computed tomography for the aortic annulus sizing in TAVI patients; 748 New insights from a multimodality imaging evaluation of LV remodeling in patients with chronic ischemic mitral regurgitation: a combined magnetic resonance and speckle tracking analysis; 749 Multimodality imaging monitoring during percutaneous tricuspid valve repair procedures [O]. RA. Spampinato, A. Kammerlander, T. Ondrus. 2016
8. Multimodality Image-Guided HDR/IMRT in Prostate Cancer: Combined Molecular Targeting Using Nanoparticle MR, 3D MRSI, and 11C Acetate PET Imaging [R]. Kurdziel, K.; Hagan, M.; Williamson, J. 2005
