Journal: Multimedia Tools and Applications

Video retrieval of near-duplicates using k-nearest neighbor retrieval of spatio-temporal descriptors

Abstract

This paper describes a novel methodology for implementing video search functions such as retrieval of near-duplicate videos and recognition of actions in surveillance video. Videos are divided into half-second clips whose stacked frames produce 3D space-time volumes of pixels. Pixel regions with consistent color and motion properties are extracted from these 3D volumes by a threshold-free hierarchical space-time segmentation technique. Each region is then described by a high-dimensional point whose components represent the position, orientation and, when possible, color of the region. In the indexing phase for a video database, these points are assigned labels that specify their video clip of origin. All the labeled points for all the clips are stored in a single binary tree for efficient k-nearest neighbor retrieval. The retrieval phase uses video segments as queries. Half-second clips of these queries are again segmented by space-time segmentation to produce sets of points, and for each point the labels of its nearest neighbors are retrieved. The labels that receive the most votes correspond to the database clips that are most similar to the query video segment. We illustrate this approach for video indexing and retrieval and for action recognition. First, we describe retrieval experiments for dynamic logos, and for video queries that differ from the indexed broadcasts by the addition of large overlays. Then we describe experiments in which office actions (such as pulling and closing drawers, taking and storing items, picking up and putting down a phone) are recognized. Color information is ignored to ensure that action recognition is independent of people's appearance. One distinct advantage of this approach for action recognition is that there is no need to detect or recognize body parts.
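The indexing and voting steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract only says "a single binary tree", so a k-d tree is assumed here, and the 3-component points, the clip labels, and the function names (`build_kdtree`, `knn`, `vote`) are all hypothetical. Real descriptors would come from the space-time segmentation step, which is not reproduced.

```python
import heapq
from collections import Counter


def build_kdtree(items, depth=0):
    """Build a k-d tree (an assumed choice of binary tree) from (point, label) pairs."""
    if not items:
        return None
    axis = depth % len(items[0][0])          # cycle through the coordinates
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2                    # median point becomes the node
    return {
        "item": items[mid],
        "axis": axis,
        "left": build_kdtree(items[:mid], depth + 1),
        "right": build_kdtree(items[mid + 1:], depth + 1),
    }


def knn(node, query, k, heap=None):
    """Collect the k nearest labeled points as a heap of (-squared_distance, label)."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    point, label = node["item"]
    d2 = sum((a - b) ** 2 for a, b in zip(point, query))
    if len(heap) < k:
        heapq.heappush(heap, (-d2, label))
    elif d2 < -heap[0][0]:                   # closer than the current worst neighbor
        heapq.heapreplace(heap, (-d2, label))
    diff = query[node["axis"]] - point[node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    knn(near, query, k, heap)
    # Visit the far side only if the splitting plane is closer than the worst neighbor.
    if len(heap) < k or diff * diff < -heap[0][0]:
        knn(far, query, k, heap)
    return heap


def vote(tree, query_points, k):
    """Each query point votes for the clip labels of its k nearest neighbors."""
    votes = Counter()
    for q in query_points:
        for _, label in knn(tree, q, k):
            votes[label] += 1
    return votes.most_common()               # labels ranked by vote count


# Toy, made-up descriptors: each point is labeled with its clip of origin.
DATABASE = [
    ((0.0, 0.0, 0.0), "clip_A"), ((0.1, 0.0, 0.1), "clip_A"),
    ((0.0, 0.2, 0.1), "clip_A"), ((5.0, 5.0, 5.0), "clip_B"),
    ((5.1, 4.9, 5.0), "clip_B"), ((4.8, 5.2, 5.1), "clip_B"),
]
QUERY = [(0.05, 0.05, 0.05), (0.0, 0.1, 0.1)]  # points from a hypothetical query clip

if __name__ == "__main__":
    tree = build_kdtree(DATABASE)
    print(vote(tree, QUERY, k=2))
```

The top-ranked label is the database clip judged most similar to the query. Storing all clips in one tree, as the abstract describes, means a single nearest-neighbor search per query point covers the whole database.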
