Deep Learning for Image-to-Text Generation: A Technical Overview

Xiaodong He; Li Deng

Abstract

Generating a natural language description from an image is an emerging interdisciplinary problem at the intersection of computer vision, natural language processing, and artificial intelligence (AI). This task, often referred to as image or visual captioning, forms the technical foundation of many important applications, such as semantic visual search, visual intelligence in chatting robots, photo and video sharing in social media, and aid for visually impaired people to perceive surrounding visual content. Thanks to the recent advances in deep learning, the AI research community has witnessed tremendous progress in visual captioning in recent years. In this article, we will first summarize this exciting emerging visual captioning area. We will then analyze the key developments and the major progress the community has made, their impact on both research and industry deployment, and what lies ahead in future breakthroughs.

Bibliographic information

Source: IEEE Signal Processing Magazine, 2017, No. 6, pp. 109-116 (8 pages)
Authors: Xiaodong He; Li Deng
Original format: PDF
Language: English
Keywords: Visualization; Semantics; Image classification; Training data; Pediatrics; Artificial intelligence; Computer vision; Natural language processing
