A Novel Convolutional Neural Network-Gated Recurrent Unit approach for Image Captioning


Abstract

Image captioning is the task of generating a textual description for an image. It combines machine learning techniques from natural language processing and computer vision to produce appropriate descriptions. Image captioning has several applications in today's world of ever-expanding data, such as application recommendation, virtual assistance, image indexing, and social media. It can also help automate the interpretation of images and describe a visual scene to the visually impaired. Image captioning has been indispensable in driving the field of human-computer interaction. Our research paper proposes a CNN-GRU based framework that is trained on large datasets of images and captions and generates accurate caption descriptions for new images. A dictionary keyed by photo identifier is built from the descriptions, and these descriptions are then converted into a vocabulary of words stored as a list. A VGG-16 convolutional neural network is proposed as the feature extractor and a gated recurrent unit (GRU) recurrent neural network as the sequence processor. Our model achieves an accuracy of 82.39%.
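The abstract does not give the caption file format or the cleaning rules, so the following is only a minimal sketch of the dictionary and vocabulary step. It assumes each input line has the form `<photo_id> <caption text>`. The function names `load_descriptions` and `build_vocabulary` and the sample captions are hypothetical, not taken from the paper.

```python
import string
from collections import defaultdict


def load_descriptions(lines):
    """Map each photo identifier to its list of cleaned captions.

    Assumption: every line looks like '<photo_id> <caption text>'.
    """
    table = str.maketrans("", "", string.punctuation)
    descriptions = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank lines and identifiers without a caption
        photo_id, words = parts[0], parts[1:]
        # Lowercase, strip punctuation, and keep alphabetic tokens only.
        cleaned = [w.lower().translate(table) for w in words]
        cleaned = [w for w in cleaned if w.isalpha()]
        descriptions[photo_id].append(" ".join(cleaned))
    return dict(descriptions)


def build_vocabulary(descriptions):
    """Collect the distinct words of all captions into a sorted list."""
    vocab = set()
    for captions in descriptions.values():
        for caption in captions:
            vocab.update(caption.split())
    return sorted(vocab)


# Hypothetical sample data for illustration only.
sample = [
    "img_001 A dog runs across the grass.",
    "img_001 The brown dog is running!",
    "img_002 Two children play on a beach.",
]
descriptions = load_descriptions(sample)
vocabulary = build_vocabulary(descriptions)
```

In a pipeline like the one the abstract describes, a vocabulary of this kind would typically be mapped to integer indices before captions are fed to the sequence processor. That mapping is not shown here because the source gives no details about it.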


Bibliographic details

Source
International Conference on Smart Systems and Inventive Technology | 2020 | pp. 704-708 | 5 pages
Authors
Sarthak Singh Rawat; Kartikeyan Singh Rawat; Rahul Nijhawan
Original format
PDF
Keywords
Logic gates; Feature extraction; Computational modeling; Recurrent neural networks; Convolutional neural networks; Training; Computer architecture


