IEEE Transactions on Circuits and Systems for Video Technology

A Super Descriptor Tensor Decomposition for Dynamic Scene Recognition



Abstract

This paper presents a new approach for dynamic scene recognition based on a super descriptor tensor decomposition. Recently, local feature extraction based on dense trajectories has been used to model motion. However, dense trajectories usually include a large number of unnecessary trajectories, which increase noise, add complexity, and limit recognition accuracy. Another problem is that traditional bag-of-words techniques encode and concatenate the local features extracted from multiple descriptors to form a single large vector for classification. This concatenation not only destroys the spatio-temporal structure among the features but also yields high dimensionality. To address these problems, we first propose to refine the dense trajectories by selecting only salient trajectories in a region of interest containing motion. Visual descriptors consisting of oriented gradient and motion boundary histograms are then computed along the refined dense trajectories. In the case of camera motion, a short-window video stabilization is integrated to compensate for global motion. Second, the features extracted from multiple descriptors are encoded using a super descriptor tensor model. To this end, the Tucker-3 tensor decomposition is employed to obtain a compact set of salient features, followed by feature selection via Fisher ranking. Experiments are conducted on two benchmark dynamic scene recognition datasets: Maryland "in-the-wild" and YUPENN dynamic scenes. Experimental results show that the proposed approach outperforms several existing methods in terms of recognition accuracy and achieves performance comparable with state-of-the-art deep learning methods. The proposed approach achieves classification rates of 89.2% on the Maryland dataset and 98.1% on the YUPENN dataset.
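Below is a minimal Python sketch (not the authors' implementation) of the two encoding steps described in the abstract: a Tucker-3 decomposition of a per-video descriptor tensor to obtain a compact core of salient features, followed by Fisher-score ranking for feature selection. The tensor layout, ranks, and helper names are illustrative assumptions; the decomposition uses the tensorly library.

import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

def encode_video(descriptor_tensor, ranks=(32, 16, 8)):
    # Tucker-3 decomposition of an assumed (trajectories x descriptor-dim x descriptor-type)
    # tensor; the flattened core serves as a compact video-level feature vector.
    core, _factors = tucker(tl.tensor(descriptor_tensor), rank=list(ranks))
    return tl.to_numpy(core).ravel()

def fisher_scores(X, y):
    # Fisher score per feature: between-class scatter over within-class scatter.
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        between += Xc.shape[0] * (Xc.mean(axis=0) - overall_mean) ** 2
        within += Xc.shape[0] * Xc.var(axis=0)
    return between / (within + 1e-12)

# Hypothetical usage: encode each training video, rank the core features,
# and keep the top-k for the final classifier.
# X = np.stack([encode_video(t) for t in training_tensors])
# top_k = np.argsort(fisher_scores(X, labels))[::-1][:500]
# X_selected = X[:, top_k]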
