Systems and methods for video paragraph captioning using hierarchical recurrent neural networks

Abstract

Described herein are systems and methods that exploit hierarchical Recurrent Neural Networks (RNNs) to tackle the video captioning problem; that is, generating one or multiple sentences to describe a realistic video. Embodiments of the hierarchical framework comprise a sentence generator and a paragraph generator. In embodiments, the sentence generator produces one simple short sentence that describes a specific short video interval. In embodiments, it exploits both temporal- and spatial-attention mechanisms to selectively focus on visual elements during generation. In embodiments, the paragraph generator captures the inter-sentence dependency by taking as input the sentential embedding produced by the sentence generator, combining it with the paragraph history, and outputting the new initial state for the sentence generator.
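The two-level loop described in the abstract can be illustrated with a minimal NumPy sketch. This is not the patented implementation: all names, dimensions, and weights below are hypothetical, the parameters are random and untrained, plain tanh recurrent units are used for brevity, and only temporal attention is shown (spatial attention is omitted). It shows only the data flow: the sentence generator attends over frame features while decoding, and the paragraph generator combines the sentence embedding with the paragraph history to produce the next initial state.

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative sizes only; none of these values come from the patent.
D_FEAT, D_HID, D_EMB, VOCAB = 8, 16, 12, 20

# Randomly initialised, untrained parameters (hypothetical names).
W_att_h = rng.normal(0, 0.1, (D_HID, D_HID))
W_att_v = rng.normal(0, 0.1, (D_FEAT, D_HID))
w_att = rng.normal(0, 0.1, D_HID)
E = rng.normal(0, 0.1, (VOCAB, D_EMB))            # word embeddings
W_in = rng.normal(0, 0.1, (D_EMB + D_FEAT, D_HID))
W_hh = rng.normal(0, 0.1, (D_HID, D_HID))
W_out = rng.normal(0, 0.1, (D_HID, VOCAB))
W_par = rng.normal(0, 0.1, (D_EMB + D_HID + D_HID, D_HID))
W_init = rng.normal(0, 0.1, (D_HID, D_HID))


def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()


def temporal_attention(h, frames):
    """Weight each frame feature by its relevance to the current state h."""
    scores = np.tanh(h @ W_att_h + frames @ W_att_v) @ w_att  # shape (T,)
    weights = softmax(scores)
    return weights @ frames, weights


def generate_sentence(h0, frames, max_len=5):
    """Sentence generator: greedily decode one short sentence."""
    h, word, words = h0, 0, []  # token 0 stands in for a begin-of-sentence symbol
    for _ in range(max_len):
        context, _ = temporal_attention(h, frames)
        h = np.tanh(np.concatenate([E[word], context]) @ W_in + h @ W_hh)
        word = int(np.argmax(h @ W_out))
        words.append(word)
    sent_emb = E[words].mean(axis=0)  # simple sentential embedding (assumption)
    return words, sent_emb, h


def paragraph_step(p, sent_emb, h_last):
    """Paragraph generator: fold the sentence into the paragraph history
    and emit the next initial state for the sentence generator."""
    p_new = np.tanh(np.concatenate([sent_emb, h_last, p]) @ W_par)
    return p_new, np.tanh(p_new @ W_init)


def caption_paragraph(frames, n_sentences=3):
    p = np.zeros(D_HID)   # paragraph history
    h0 = np.zeros(D_HID)  # initial state of the sentence generator
    paragraph = []
    for _ in range(n_sentences):
        words, sent_emb, h_last = generate_sentence(h0, frames)
        paragraph.append(words)
        p, h0 = paragraph_step(p, sent_emb, h_last)
    return paragraph


frames = rng.normal(size=(6, D_FEAT))  # six fake frame-feature vectors
paragraph = caption_paragraph(frames)  # list of token-id lists, one per sentence
```

Because the weights are untrained, the token ids produced are meaningless; the point is that each sentence after the first starts from a state that depends on everything generated so far.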

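Bibliographic details

  • Publication number: US10395118B2

    Patent type

  • Publication date: 2019-08-27

    Original format: PDF

  • Applicant/patentee: BAIDU USA LLC

    Application number: US201615183678

  • Filing date: 2016-06-15

  • Classification: G06N3/04; G06K9; G06N3/08

  • Country: US

  • Database entry time: 2022-08-21 12:14:43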
