International Conference on Computational Linguistics

Generating Video Description using Sequence-to-sequence Model with Temporal Attention

Abstract

Automatic video description generation has recently been attracting attention, following rapid advances in image caption generation. Automatically generating a description for a video is more challenging than for an image because of the temporal dynamics across frames. Most prior work has relied on Recurrent Neural Networks (RNNs), and recently attentional mechanisms have also been applied so that the model learns to focus on particular frames of the video while generating each word of the describing sentence. In this paper, we focus on a sequence-to-sequence approach with a temporal attention mechanism. We analyze and compare the results from different attention model configurations. By applying the temporal attention mechanism to the system, we achieve a METEOR score of 0.310 on the Microsoft Video Description dataset, outperforming the current state-of-the-art system.
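The temporal attention mechanism described in the abstract can be illustrated with a short sketch. Below is a minimal, generic soft-attention layer in PyTorch that weights encoded frame features against the decoder state before each word is emitted; the class and parameter names (TemporalAttention, frame_dim, hidden_dim, attn_dim) are illustrative assumptions, not taken from the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalAttention(nn.Module):
    """Soft temporal attention: score each frame feature against the current
    decoder state and return a weighted average (context vector)."""

    def __init__(self, frame_dim, hidden_dim, attn_dim):
        super().__init__()
        self.frame_proj = nn.Linear(frame_dim, attn_dim)   # project frame features
        self.state_proj = nn.Linear(hidden_dim, attn_dim)  # project decoder state
        self.score = nn.Linear(attn_dim, 1)                # scalar attention energy

    def forward(self, frames, decoder_state):
        # frames: (batch, num_frames, frame_dim) encoded frame features
        # decoder_state: (batch, hidden_dim) decoder state before the next word
        energy = torch.tanh(self.frame_proj(frames)
                            + self.state_proj(decoder_state).unsqueeze(1))
        weights = F.softmax(self.score(energy).squeeze(-1), dim=1)    # (batch, num_frames)
        context = torch.bmm(weights.unsqueeze(1), frames).squeeze(1)  # (batch, frame_dim)
        return context, weights

# Usage sketch: attend over 26 frame features of dimension 500 with a 512-d decoder state.
attn = TemporalAttention(frame_dim=500, hidden_dim=512, attn_dim=256)
context, weights = attn(torch.randn(2, 26, 500), torch.randn(2, 512))
```

At each decoding step, the context vector is concatenated with (or fed into) the decoder input, so the attention weights indicate which frames the model focuses on for the word being generated.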
