Journal: IEEE Signal Processing Magazine

Deep Learning for Image-to-Text Generation: A Technical Overview


Abstract

Generating a natural language description from an image is an emerging interdisciplinary problem at the intersection of computer vision, natural language processing, and artificial intelligence (AI). This task, often referred to as image or visual captioning, forms the technical foundation of many important applications, such as semantic visual search, visual intelligence in chatbots, photo and video sharing on social media, and aids that help visually impaired people perceive surrounding visual content. Thanks to advances in deep learning, the AI research community has witnessed tremendous progress in visual captioning in recent years. In this article, we first summarize this emerging area. We then analyze the key developments and major progress the community has made, their impact on both research and industry deployment, and what lies ahead in future breakthroughs.
