International Conference on Computational Linguistics

Generating Video Description using Sequence-to-sequence Model with Temporal Attention

Abstract

Automatic video description generation has recently received increasing attention following rapid advances in image caption generation. Automatically generating a description for a video is more challenging than for an image because of the temporal dynamics across frames. Most prior work has relied on Recurrent Neural Networks (RNNs), and attentional mechanisms have recently been applied so that the model learns to focus on particular frames of the video while generating each word of the describing sentence. In this paper, we focus on a sequence-to-sequence approach with a temporal attention mechanism. We analyze and compare the results of different attention model configurations. By applying the temporal attention mechanism to the system, we achieve a METEOR score of 0.310 on the Microsoft Video Description dataset, outperforming the previous state-of-the-art system.
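
The temporal attention described in the abstract scores each video frame against the current decoder state and summarizes the frames into a context vector before the next word is generated. The sketch below is a minimal, illustrative NumPy implementation of additive temporal attention over frame features; the function name temporal_attention and the parameters W_v, W_h, and w are hypothetical and not taken from the paper, whose exact scoring function and configuration may differ.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def temporal_attention(frame_feats, dec_state, W_v, W_h, w):
    """Weight every frame by its relevance to the current decoder state
    and return the attention-weighted context vector."""
    # frame_feats: (T, d_v) per-frame features; dec_state: (d_h,) decoder hidden state
    scores = np.tanh(frame_feats @ W_v + dec_state @ W_h) @ w  # (T,) alignment scores
    alphas = softmax(scores)                                    # attention weights over frames
    context = alphas @ frame_feats                              # (d_v,) weighted frame summary
    return context, alphas

# Toy usage with hypothetical dimensions.
rng = np.random.default_rng(0)
T, d_v, d_h, d_a = 8, 16, 12, 10
ctx, alphas = temporal_attention(
    rng.standard_normal((T, d_v)), rng.standard_normal(d_h),
    rng.standard_normal((d_v, d_a)), rng.standard_normal((d_h, d_a)),
    rng.standard_normal(d_a))
print(alphas.round(3), ctx.shape)
```

In a typical sequence-to-sequence decoder of this kind, the resulting context vector would be combined with the previous word embedding as input to the RNN decoder at each step, so the attention weights are recomputed for every generated word.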
