LoL-V2T: Large-Scale Esports Video Description Dataset

Abstract

Esports is a fast-growing field with a largely online presence, and it is creating demand for automatic domain-specific captioning tools. However, few existing approaches tackle the esports video description problem. In this work, we propose a large-scale dataset for esports video description, focusing on the popular game "League of Legends". The dataset, which we call LoL-V2T, is the largest video description dataset in the video game domain, and includes 9,723 clips with 62,677 captions. This new dataset presents multiple new video captioning challenges, such as large amounts of domain-specific vocabulary, subtle motions of great importance, and a temporal gap between most captions and the events they describe. To tackle the vocabulary issue, we propose masking the domain-specific words and provide additional annotations for this. Our results show that the dataset poses a challenge to existing video captioning approaches, and that the masking can significantly improve performance. Our dataset and code are publicly available.
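
The masking idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the category tokens (`<champion>`, `<team>`), the term lists, and the helper names `build_patterns` and `mask_caption` are all hypothetical, chosen only to show how domain-specific words in a caption might be replaced by generic placeholders before training a captioning model.

```python
import re

# Hypothetical vocabulary: the category tokens and terms below are
# illustrative assumptions, not the annotations provided with LoL-V2T.
DOMAIN_TERMS = {
    "<champion>": ["ahri", "lee sin", "jinx"],
    "<team>": ["t1", "g2"],
}


def build_patterns(domain_terms):
    """Compile one case-insensitive regex per placeholder token."""
    patterns = []
    for token, terms in domain_terms.items():
        # Try longer terms first so multi-word names are matched whole.
        ordered = sorted(terms, key=len, reverse=True)
        alternation = "|".join(re.escape(term) for term in ordered)
        regex = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
        patterns.append((regex, token))
    return patterns


def mask_caption(caption, patterns):
    """Replace every domain-specific term with its placeholder token."""
    for regex, token in patterns:
        caption = regex.sub(token, caption)
    return caption


PATTERNS = build_patterns(DOMAIN_TERMS)
masked = mask_caption("Lee Sin kicks Jinx toward the T1 base", PATTERNS)
print(masked)
```

Under these assumptions, many distinct proper nouns collapse to a handful of placeholder tokens, which shrinks the vocabulary a captioning model has to learn.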


Bibliographic details

Source
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2021, pp. 4552-4561 (10 pages)
Authors
Tsunehiko Tanaka; Edgar Simo-Serra
Original format: PDF
Keywords
Training; Vocabulary; Computer vision; Video description; Conferences; Focusing; Games


