Improving Caption Consistency to Image with Semantic Filter by Adversarial Training


Abstract

Benefiting from larger-scale datasets, image captioning has achieved remarkable success in generating more humanlike captions. However, for specific tasks trained on small-scale datasets (e.g., stylized image captioning), the visual objects and semantic diversity are generally insufficient. Although the generated captions are suitable, they still fail to depict the image with comprehensive visual objects, which reduces the fluency and accuracy of the expressions. To address this issue, we propose an image captioning system based on an adversarial training strategy. To improve accuracy, a semantic filter module is implemented to obtain informative context from the semantic vectors. With an architecture of two separate LSTMs, our model learns the image features and semantic vectors at the global and local levels. Through adversarial training, the generated caption integrates accurate information and is expressed in a fluent style. Experimental results on the FlickrStyle10K dataset show the outstanding performance of our approach in capturing semantic knowledge. The linguistic analysis demonstrates that our model succeeds in improving the accuracy and fluency of generated captions.
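The abstract does not specify how the semantic filter module is built. As a rough illustration only, one common way to filter a semantic vector is an element-wise sigmoid gate conditioned on the image feature. The sketch below assumes that formulation; the function name `semantic_filter`, the gate equation, the dimensions, and the random weights are all hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def semantic_filter(image_feat, semantic_vec, weights, bias):
    """Hypothetical gate: sigmoid(W [image; semantic] + b) * semantic_vec.

    Each semantic dimension is scaled by a gate in (0, 1), so dimensions
    judged uninformative for this image are suppressed.
    """
    joint = list(image_feat) + list(semantic_vec)
    gates = [
        sigmoid(sum(w * x for w, x in zip(row, joint)) + b)
        for row, b in zip(weights, bias)
    ]
    return [g * s for g, s in zip(gates, semantic_vec)]


# Toy inputs (assumed sizes: 4-dim image feature, 3-dim semantic vector).
random.seed(0)
image_feat = [random.uniform(-1, 1) for _ in range(4)]
semantic_vec = [random.uniform(-1, 1) for _ in range(3)]
weights = [[random.uniform(-1, 1) for _ in range(7)] for _ in range(3)]
bias = [0.0, 0.0, 0.0]

filtered = semantic_filter(image_feat, semantic_vec, weights, bias)
print(filtered)
```

Because every gate lies strictly between 0 and 1, the filtered vector never exceeds the original in magnitude on any dimension. In a trained model the weights would be learned jointly with the caption generator rather than drawn at random.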


Bibliographic details

Source: IEEE International Conference on Mechatronics and Automation, 2021, pp. 269-274 (6 pages)
Authors: Junlong Feng; Jianping Zhao
Original format: PDF
Keywords: Training; Visualization; Mechatronics; Conferences; Semantics; Linguistics; Information filters


Similar literature

1. Retrieval-enhanced adversarial training with dynamic memory-augmented attention for image paragraph captioning [J]. Xu Chunpu, Yang Min, Ao Xiang. Knowledge-Based Systems, 2021 (Feb. 28).
2. Refining Synthetic Images with Semantic Layouts by Adversarial Training [J]. Tongtong Zhao, Yuxiao Yan, JinJia Peng. JMLR: Workshop and Conference Proceedings, 2018, No. 4.
3. Automated semantic indexing of figure captions to improve radiology image retrieval [J]. Kahn CE Jr, Rubin DL. Journal of the American Medical Informatics Association, 2009, No. 3.
4. Towards Generating Stylized Image Captions via Adversarial Training [C]. Omid Mohamad Nezami, Mark Dras, Stephen Wan. Pacific Rim International Conference on Artificial Intelligence, 2019.
5. Semantic Segmentation for Producing Nuclei Stained Images Using Conditional Generative Adversarial Networks [D]. Bhatia, Tanmay. 2019.
6. Automated Semantic Indexing of Figure Captions to Improve Radiology Image Retrieval [O]. Charles E. Kahn Jr., Daniel L. Rubin. 2009.
7. Adversarial Semantic Alignment for Improved Image Captions [O]. Pierre Dognin, Igor Melnyk, Youssef Mroueh. 2019.
