Journal: IEEE Transactions on Circuits and Systems for Video Technology

Content-based video parsing and indexing based on audio-visual interaction

Abstract

This paper presents a content-based video parsing and indexing method that analyzes both information sources (auditory and visual) and accounts for their interrelations and synergy to extract high-level semantic information. Both frame-based and object-based access to the visual information is employed. The aim of the method is to extract semantically meaningful video scenes and to assign semantic label(s) to them. Because video is temporal in nature, time has to be accounted for; thus, time-constrained video representations and indices are generated. The current approach searches for specific types of content information relevant to the presence or absence of speakers or persons. Audio-source parsing and indexing leads to the extraction of a speaker-label mapping of the source over time. Video-source parsing and indexing results in the extraction of a talking-face shot mapping over time. Integrating the audio and visual mappings, constrained by interaction rules, leads to higher levels of video abstraction and even partial detection of its context.
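
The abstract does not state the interaction rules themselves, so the following is only a minimal sketch of how a speaker-label mapping and a talking-face shot mapping over time might be combined. The rule set in `apply_rule`, the scene labels, and all type and function names (`AudioSegment`, `VideoShot`, `Scene`, `integrate`) are illustrative assumptions, not the paper's method.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AudioSegment:
    start: float             # seconds
    end: float
    speaker: Optional[str]   # None means no speech was detected


@dataclass(frozen=True)
class VideoShot:
    start: float             # seconds
    end: float
    talking_face: bool       # True if the shot shows a talking face


@dataclass(frozen=True)
class Scene:
    start: float
    end: float
    label: str
    speaker: Optional[str]


def speaker_at(audio: List[AudioSegment], t: float) -> Optional[str]:
    """Return the speaker label active at time t, or None."""
    for seg in audio:
        if seg.start <= t < seg.end:
            return seg.speaker
    return None


def face_at(video: List[VideoShot], t: float) -> bool:
    """Return True if a talking-face shot covers time t."""
    for shot in video:
        if shot.start <= t < shot.end:
            return shot.talking_face
    return False


def apply_rule(speaker: Optional[str], face: bool) -> str:
    # Hypothetical interaction rules, chosen only for illustration.
    if speaker is not None and face:
        return "on-screen speaker"
    if speaker is not None:
        return "off-screen voice"
    if face:
        return "unmatched talking face"
    return "no speaker"


def integrate(audio: List[AudioSegment], video: List[VideoShot]) -> List[Scene]:
    """Split the timeline at every audio or video boundary, label each
    interval with apply_rule, and merge adjacent intervals that agree."""
    cuts = sorted(
        {t for s in audio for t in (s.start, s.end)}
        | {t for s in video for t in (s.start, s.end)}
    )
    scenes: List[Scene] = []
    for t0, t1 in zip(cuts, cuts[1:]):
        mid = (t0 + t1) / 2  # any point inside the interval works
        speaker = speaker_at(audio, mid)
        label = apply_rule(speaker, face_at(video, mid))
        prev = scenes[-1] if scenes else None
        if prev and prev.end == t0 and prev.label == label and prev.speaker == speaker:
            scenes[-1] = Scene(prev.start, t1, label, speaker)
        else:
            scenes.append(Scene(t0, t1, label, speaker))
    return scenes
```

Calling `integrate` with a list of `AudioSegment` values and a list of `VideoShot` values returns time-constrained `Scene` records. For example, a stretch where speaker "A" is heard while a talking face is shown would be labeled "on-screen speaker" under these assumed rules, and it would switch to "off-screen voice" once the shot changes but the speech continues.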
