Multimodal Abstractive Summarization for How2 Videos


Abstract

In this paper, we study abstractive summarization for open-domain videos. Unlike traditional text news summarization, the goal is less to "compress" text information than to provide a fluent textual summary of information that has been collected and fused from different source modalities, in our case video and audio transcripts (or text). We show how a multi-source sequence-to-sequence model with hierarchical attention can integrate information from different modalities into a coherent output, compare various models trained with different modalities, and present pilot experiments on the How2 corpus of instructional videos. We also propose a new evaluation metric (Content F1) for the abstractive summarization task that measures the semantic adequacy of the summaries rather than their fluency, which is already covered by metrics like ROUGE and BLEU.
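The abstract names Content F1 but does not define it. As a rough illustration of the idea of scoring content overlap rather than fluency, the sketch below computes an F1 score over content words only. It rests on assumptions that are not taken from the abstract: words are matched by exact string equality, tokenization is a plain whitespace split, and `STOP_WORDS` is a small hypothetical list. The function names `content_words` and `content_f1` are likewise illustrative, and the metric as defined in the paper may differ.

```python
from collections import Counter

# Hypothetical stop-word list, for illustration only (an assumption).
STOP_WORDS = frozenset({
    "a", "an", "the", "to", "of", "in", "and", "is",
    "how", "this", "on", "for", "with", "you",
})


def content_words(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = (tok.strip(".,!?;:\"'()") for tok in text.lower().split())
    return [tok for tok in tokens if tok and tok not in STOP_WORDS]


def content_f1(hypothesis, reference):
    """F1 over content-word overlap (exact matching is an assumption)."""
    hyp = Counter(content_words(hypothesis))
    ref = Counter(content_words(reference))
    # Multiset intersection: each word counts at most as often as it
    # appears in both the hypothesis and the reference.
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For example, `content_f1("learn how to tie a knot", "tie a fishing knot")` compares the content words `learn, tie, knot` with `tie, fishing, knot`. Two of three words match on each side, so precision and recall are both 2/3. Word order plays no role, which is why a score of this kind says nothing about fluency.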


Bibliographic record

Source
Annual Meeting of the Association for Computational Linguistics | 2019 | pp. 6587-6596 | 10 pages
Authors
Shruti Palaskar; Jindrich Libovicky; Spandana Gella; Florian Metze
Original format: PDF

Similar literature


1. D-MmT: A concise decoder-only multi-modal transformer for abstractive summarization in videos [J]. Liu Nayu, Sun Xian, Yu Hongfeng. Neurocomputing. 2021, No. Octa7
2. Video Summarization Based on Multimodal Features [J]. Zhang Yu, Liu Ju, Liu Xiaoxi. International Journal of Multimedia Data Engineering & Management. 2020, No. 4
3. COGNIMUSE: a multimodal video database annotated with saliency, events, semantics and emotion with application to summarization [J]. Athanasia Zlatintsi, Petros Koutras, Georgios Evangelopoulos. EURASIP Journal on Image and Video Processing. 2017, No. 1
4. Multimodal Abstractive Summarization for How2 Videos [C]. Shruti Palaskar, Jindrich Libovicky, Spandana Gella. Annual Meeting of the Association for Computational Linguistics. 2019
5. Optimization-based summarization and indexing of extended videos, with application to instructional video semantics [D]. Liu, Tiecheng. 2003
6. Visual saliency models for summarization of diagnostic hysteroscopy videos in healthcare systems [O]. Khan Muhammad, Jamil Ahmad, Muhammad Sajjad
7. Multimodal Abstractive Summarization for How2 Videos [O]. Shruti Palaskar, Jindřich Libovický, Spandana Gella. 2019
