Video Caption Dataset for Describing Human Actions in Japanese

Abstract

In recent years, automatic video caption generation has attracted considerable attention. This paper focuses on the generation of Japanese captions for describing human actions. While most currently available video caption datasets have been constructed for English, there is no equivalent Japanese dataset. To address this, we constructed a large-scale Japanese video caption dataset consisting of 79,822 videos and 399,233 captions. Each caption in our dataset describes a video in the form of "who does what and where." To describe human actions, it is important to identify the details of a person, place, and action. Indeed, when we describe human actions, we usually mention the scene, person, and action. In our experiments, we evaluated two caption generation methods to obtain benchmark results. Further, we investigated whether those generation methods could specify "who does what and where."

Bibliographic details

Source
International Conference on Language Resources and Evaluation | 2020 | pp. 4664-4670 | 7 pages
Authors
Yutaro Shigeto; Yuya Yoshikawa; Jiaqing Lin; Akikazu Takeuchi
Original format
PDF
Keywords
video captioning; caption generation; Japanese caption dataset; human action understanding

Similar literature

1. Describing Trajectory of Surface Patch for Human Action Recognition on RGB and Depth Videos [J]. Song Y., Liu S., Tang J. Signal Processing Letters, IEEE. 2015, No. 4
2. M-VAD names: a dataset for video captioning with naming [J]. Pini Stefano, Cornia Marcella, Bolelli Federico, Multimedia Tools and Applications. 2019, No. 10
3. STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset [C]. Yuya Yoshikawa, Yutaro Shigeto, Akikazu Takeuchi. Annual meeting of the Association for Computational Linguistics. 2017
4. Reciprocity in Online Social Interactions: Three Longitudinal Case Studies of a Video-Mediated Japanese-English e-Tandem Exchange [D]. Akiyama, Yuka. 2018
5. An effective datasets describing antimicrobial peptide produced from Pediococcus acidilactici - purification and mode of action determined by molecular docking [O]. Ramachandran Chelliah, Kandasamy Saravanakumar, Eric Banan-Mwine Daliri, 2020
6. STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset [O]. Yoshikawa, Yuya, Shigeto, Yutaro, Takeuchi, Akikazu. 2017
