A METHOD FOR TRAINING A CONVOLUTIONAL NEURAL NETWORK FOR IMAGE RECOGNITION USING IMAGE-CONDITIONED MASKED LANGUAGE MODELING

Abstract

A method and system pre-train a convolutional neural network for image recognition using masked language modeling by: inputting an image to the convolutional neural network; outputting, from the convolutional neural network, a visual embedding tensor of visual embedding vectors; tokenizing a caption to create a list of tokens, at least one of which has visual correspondence to the image received by the convolutional neural network; randomly selecting one of the tokens in the list to be masked, the selected token being taken as ground truth; computing, using a language model neural network, hidden representations of the tokens; using the hidden representation of the masked token as a query vector to attentively pool the visual embedding vectors in the visual embedding tensor; predicting the masked token by mapping the pooled visual embedding vectors to the tokens; determining a prediction loss associated with the masked token; and back-propagating the prediction loss to the convolutional neural network to tune its parameters.
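The forward pass described in the abstract (mask one caption token, pool the visual embedding vectors with the masked token's hidden representation as the query, then score the vocabulary) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the visual embedding tensor and the query vector are random stand-ins for the outputs of the convolutional network and the language model, the attention is an unscaled dot-product softmax, and the mapping from the pooled vector to tokens is assumed to be a dot product with a hypothetical token embedding table. All names (`attentive_pool`, `masked_token_loss`, `vocab_embeddings`) are invented for this example.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pool(query, visual_vectors):
    """Sum the visual vectors, weighted by softmax(query . vector)."""
    weights = softmax([dot(query, v) for v in visual_vectors])
    dim = len(visual_vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, visual_vectors))
              for d in range(dim)]
    return pooled, weights


def masked_token_loss(pooled, vocab_embeddings, target_id):
    """Cross-entropy of the ground-truth token given the pooled vector."""
    probs = softmax([dot(pooled, e) for e in vocab_embeddings])
    return -math.log(probs[target_id]), probs


rng = random.Random(0)
dim, height, width = 4, 2, 3
vocab = ["[MASK]", "a", "dog", "on", "grass"]

# Stand-in for the CNN output: a height x width grid of dim-sized vectors.
visual_tensor = [[[rng.gauss(0, 1) for _ in range(dim)]
                  for _ in range(width)] for _ in range(height)]
visual_vectors = [vec for row in visual_tensor for vec in row]

# Tokenize the caption and mask one randomly selected token.
tokens = "a dog on grass".split()
mask_pos = rng.randrange(len(tokens))
target_id = vocab.index(tokens[mask_pos])  # ground truth
masked_tokens = tokens[:mask_pos] + ["[MASK]"] + tokens[mask_pos + 1:]

# Stand-in for the language model's hidden state at the masked position.
query = [rng.gauss(0, 1) for _ in range(dim)]

# Hypothetical token embedding table used to map the pooled vector to tokens.
vocab_embeddings = [[rng.gauss(0, 1) for _ in range(dim)] for _ in vocab]

pooled, weights = attentive_pool(query, visual_vectors)
loss, probs = masked_token_loss(pooled, vocab_embeddings, target_id)
print(masked_tokens, "loss:", loss)
```

The sketch stops at the prediction loss. In an actual training setup, an automatic differentiation framework would back-propagate this loss through the pooling step into the convolutional network's parameters, as the abstract describes.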