Foreign-language conference proceedings: International Conference on Smart Systems and Inventive Technology

A Novel Convolutional Neural Network-Gated Recurrent Unit approach for Image Captioning

Abstract

Image captioning is the task of generating a textual description for an image. It involves Machine Learning techniques such as Natural Language Processing and Computer Vision to produce appropriate descriptions for images. Image captioning has several applications in today's world of ever-expanding data, such as application recommendation, virtual assistance, image indexing, and social media. It can also help automate the job of interpreting images and describe a visual scene to the visually impaired. Image captioning has been indispensable in driving the Human-Computer Interaction field. This paper proposes a CNN-GRU based framework that is trained on large datasets of images and captions and generates accurate captions for new images. A dictionary keyed by photo identifier is built from the descriptions, and these descriptions are then converted into a vocabulary of words stored as a list. A VGG-16 Convolutional Neural Network is proposed as the feature extractor and a Gated Recurrent Unit (GRU) Recurrent Neural Network as the sequence processor. The model achieves an accuracy of 82.39%.
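The description-preprocessing step mentioned in the abstract (a dictionary of photo identifiers mapped to descriptions, then a vocabulary list) might look roughly like the sketch below. It is only an illustration, not the authors' code: the input line format (`<file>#<n> <caption>`), the function names `load_descriptions` and `build_vocabulary`, the sample captions, and the `startseq`/`endseq` marker tokens are all assumptions that do not come from the abstract.

```python
import string
from collections import defaultdict


def load_descriptions(lines):
    """Map each photo identifier to a list of cleaned captions.

    Assumes (hypothetically) one caption per line, formatted as
    "<file>#<n> <caption text>".
    """
    strip_punct = str.maketrans("", "", string.punctuation)
    descriptions = defaultdict(list)
    for line in lines:
        parts = line.strip().split(maxsplit=1)
        if len(parts) < 2:
            continue  # skip blank lines and lines without a caption
        photo_id, caption = parts
        photo_id = photo_id.split(".")[0]  # drop ".jpg#0"-style suffix
        words = caption.lower().translate(strip_punct).split()
        words = [w for w in words if w.isalpha()]  # drop numeric tokens
        # startseq/endseq are assumed markers for the sequence model.
        descriptions[photo_id].append("startseq " + " ".join(words) + " endseq")
    return dict(descriptions)


def build_vocabulary(descriptions):
    """Collect every distinct word across all captions as a sorted list."""
    vocab = set()
    for captions in descriptions.values():
        for caption in captions:
            vocab.update(caption.split())
    return sorted(vocab)


# Made-up sample data, for illustration only.
sample = [
    "img_001.jpg#0 A dog runs across the grass.",
    "img_001.jpg#1 A brown dog is running!",
    "img_002.jpg#0 Two children play on a beach.",
]
descriptions = load_descriptions(sample)
vocabulary = build_vocabulary(descriptions)
# Index 0 is left free so it can serve as a padding value.
word_to_index = {word: i + 1 for i, word in enumerate(vocabulary)}
```

In a pipeline of the kind the abstract describes, the integer-encoded captions would then be fed to the GRU sequence processor alongside the image features produced by the VGG-16 extractor; that part is not shown here.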
