Conference: International Conference on Universal Digital Library

Semantic Understanding for Video Retrieval with Temporal Multimodal Fusion Analysis


Abstract

As video makes up a growing share of digital library content, effective video retrieval services are in great demand. However, compared with text and images, video lacks obvious semantics, even though it carries rich multimodal information such as text transcripts, visual and audio features, and temporal structure. Understanding the semantics embedded in video is therefore necessary for video retrieval. In this paper, we propose a comprehensive approach to the semantic understanding of video through automatic annotation with temporal multimodal fusion analysis. We investigate several media aspects, including meaningful words and their contextual distribution in the transcript, visual and audio features, and, most importantly, the temporal interval relations within the video. A TF-IDF retrieval method with score propagation is used to discover the association between a shot and its corresponding transcript. Experiments on the TRECVID 2003 dataset show that our approach achieves high performance.
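
The abstract does not say how the TF-IDF scores are computed or propagated, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each shot has a (possibly empty) transcript segment, scores the segments against a query with a plain TF-IDF sum, and then spreads each score to temporally adjacent shots. The function names, the `window` and `decay` parameters, and the sample transcripts are all hypothetical.

```python
import math
from collections import Counter


def tfidf_scores(query_terms, shot_transcripts):
    """Score each shot's transcript against the query with a plain TF-IDF sum."""
    docs = [Counter(text.lower().split()) for text in shot_transcripts]
    n_docs = len(docs)
    scores = []
    for doc in docs:
        score = 0.0
        for term in query_terms:
            df = sum(1 for d in docs if term in d)
            if df == 0 or doc[term] == 0:
                continue
            # +1 keeps the weight positive even for terms present in every shot
            idf = math.log(n_docs / df) + 1.0
            score += doc[term] * idf
        scores.append(score)
    return scores


def propagate(scores, window=1, decay=0.5):
    """Spread each shot's score to its temporal neighbours (assumed scheme).

    Each shot keeps its own score and additionally receives
    decay**distance times the score of every shot within `window` positions.
    """
    out = []
    for i in range(len(scores)):
        total = scores[i]
        for dist in range(1, window + 1):
            weight = decay ** dist
            if i - dist >= 0:
                total += weight * scores[i - dist]
            if i + dist < len(scores):
                total += weight * scores[i + dist]
        out.append(total)
    return out


# Hypothetical transcripts, one per shot, in temporal order.
shots = [
    "the president arrives at the airport",
    "",  # a shot with no speech aligned to it
    "weather forecast for the weekend",
]
raw = tfidf_scores(["president"], shots)
smoothed = propagate(raw, window=1, decay=0.5)
```

In this sketch the second shot has no transcript and so scores zero on its own, but after propagation it inherits part of the first shot's score. That is one plausible reason to propagate scores at all: spoken words are often not aligned with the shot that depicts them.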
