Towards Video Captioning with Naming: A Novel Dataset and a Multi-modal Approach



Abstract

Current approaches for movie description lack the ability to name characters with their proper names, and can only indicate people with a generic "someone" tag. In this paper we present two contributions towards the development of video description architectures with naming capabilities: firstly, we collect and release an extension of the popular Montreal Video Annotation Dataset in which the visual appearance of each character is linked both through time and to textual mentions in captions. We annotate, in a semi-automatic manner, a total of 53k face tracks and 29k textual mentions on 92 movies. Moreover, to underline and quantify the challenges of the task of generating captions with names, we present different multi-modal approaches to solve the problem on already generated captions.


Bibliographic record

Source
International Conference on Image Analysis and Processing | 2017 | pp. 384-395 | 12 pages
Authors
Stefano Pini; Marcella Cornia; Lorenzo Baraldi; Rita Cucchiara
Original format: PDF
Keywords
Video captioning; Naming; Datasets; Deep learning


Similar literature


1. M-VAD names: a dataset for video captioning with naming [J]. Pini Stefano, Cornia Marcella, Bolelli Federico, Multimedia Tools and Applications. 2019, no. 10
2. An Adaptive Novel Approach for Detection of Text and Caption in Videos [J]. Bindhu N, Bala Murugan C, International Journal of Engineering Research and Applications. 2013, no. 2
3. Towards Video Captioning with Naming: A Novel Dataset and a Multi-modal Approach [C]. Stefano Pini, Marcella Cornia, Lorenzo Baraldi, International Conference on Image Analysis and Processing. 2017
4. Image Captioning: A Survey of Existing Issues on Datasets, Evaluation Metrics and Methods [D]. Zhou, Liwan. 2020
5. Harmonisation of variables names prior to conducting statistical analyses with multiple datasets: an automated approach [O]. Xavier Bosch-Capblanch. 2011
6. M-VAD names: a dataset for video captioning with naming [O]. Stefano Pini, Marcella Cornia, Federico Bolelli, 2018
