International Conference on Computational Linguistics

Generating Video Description using Sequence-to-sequence Model with Temporal Attention

Abstract

Automatic video description generation has recently received increasing attention following rapid advances in image caption generation. Automatically generating a description for a video is more challenging than for an image because of the temporal dynamics across frames. Most prior work has relied on Recurrent Neural Networks (RNNs), and attentional mechanisms have recently been applied so that the model learns to focus on particular frames of the video while generating each word of the describing sentence. In this paper, we focus on a sequence-to-sequence approach with a temporal attention mechanism. We analyze and compare the results of different attention model configurations. By applying the temporal attention mechanism to the system, we achieve a METEOR score of 0.310 on the Microsoft Video Description dataset, outperforming the previous state-of-the-art system.
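
The temporal attention described in the abstract scores each video frame against the current decoder state and summarizes the frames into a context vector before the next word is generated. The sketch below is a minimal, illustrative NumPy implementation of additive temporal attention over frame features; the function name temporal_attention and the parameters W_v, W_h, and w are hypothetical and not taken from the paper, whose exact scoring function and configuration may differ.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def temporal_attention(frame_feats, dec_state, W_v, W_h, w):
    """Weight every frame by its relevance to the current decoder state
    and return the attention-weighted context vector."""
    # frame_feats: (T, d_v) per-frame features; dec_state: (d_h,) decoder hidden state
    scores = np.tanh(frame_feats @ W_v + dec_state @ W_h) @ w  # (T,) alignment scores
    alphas = softmax(scores)                                    # attention weights over frames
    context = alphas @ frame_feats                              # (d_v,) weighted frame summary
    return context, alphas

# Toy usage with hypothetical dimensions.
rng = np.random.default_rng(0)
T, d_v, d_h, d_a = 8, 16, 12, 10
ctx, alphas = temporal_attention(
    rng.standard_normal((T, d_v)), rng.standard_normal(d_h),
    rng.standard_normal((d_v, d_a)), rng.standard_normal((d_h, d_a)),
    rng.standard_normal(d_a))
print(alphas.round(3), ctx.shape)
```

In a typical sequence-to-sequence decoder of this kind, the resulting context vector would be combined with the previous word embedding as input to the RNN decoder at each step, so the attention weights are recomputed for every generated word.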
