IEEE Transactions on Multimedia

A Novel Video Summarization Based on Mining the Story-Structure and Semantic Relations Among Concept Entities


Abstract

Video summarization techniques have been proposed for years to offer people a comprehensive understanding of the whole story in a video. Roughly speaking, existing approaches can be classified into two types: static storyboards and dynamic skimming. However, although these traditional methods give users brief summaries, they still do not provide a concept-organized and systematic view. In this paper, we present a structural video content browsing system and a novel summarization method that utilize four kinds of entities: who, what, where, and when, to establish the framework of the video contents. With the assistance of this indexed information, the structure of the story can be built up according to the characters, the things, the places, and the time. Therefore, users can not only browse the video efficiently but also focus on what interests them via the browsing interface. To construct the fundamental system, we employ the maximum entropy criterion to integrate visual and text features extracted from video frames and speech transcripts, generating high-level concept entities. A novel concept expansion method is introduced to explore the associations among these entities. After constructing the relational graph, we exploit a graph entropy model to detect meaningful shots and relations, which serve as the indices for users. The results demonstrate that our system achieves better performance and information coverage.
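The graph-entropy selection step described above can be illustrated with a minimal sketch. Here the relational graph connects concept entities (who/what/where/when) by co-occurrence, entropy is taken over the degree distribution, and nodes are ranked by how much their removal perturbs the graph's entropy. The node names and the degree-based entropy definition are illustrative assumptions, not the paper's exact model:

```python
import math

def graph_entropy(adj):
    """Shannon entropy of the degree distribution of an undirected graph."""
    degrees = [len(nbrs) for nbrs in adj.values()]
    total = sum(degrees)
    if total == 0:
        return 0.0
    ent = 0.0
    for d in degrees:
        if d > 0:
            p = d / total
            ent -= p * math.log2(p)
    return ent

def entropy_drop(adj, node):
    """Entropy change when `node` is removed; a larger drop suggests
    the entity is more structurally important to the story graph."""
    reduced = {v: {u for u in nbrs if u != node}
               for v, nbrs in adj.items() if v != node}
    return graph_entropy(adj) - graph_entropy(reduced)

# Hypothetical relational graph: nodes are concept entities extracted
# from shots, edges are co-occurrence relations between them.
edges = [("anchor", "election"), ("anchor", "studio"),
         ("election", "candidate"), ("candidate", "rally"),
         ("rally", "city_hall"), ("election", "city_hall")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

# Rank entities by structural importance; the top-ranked entities (and the
# shots containing them) would serve as summary indices.
ranked = sorted(adj, key=lambda v: entropy_drop(adj, v), reverse=True)
print(ranked[:3])
```

In the paper's system this ranking would be computed over entities fused from visual and text features, but the selection principle, keeping the shots whose entities carry the most graph-structural information, is the same.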
