Journal: Circuits, Systems, and Signal Processing

Deep Multi-view Representation Learning for Video Anomaly Detection Using Spatiotemporal Autoencoders



Abstract

Visual perception is a transformative technology that can recognize patterns from environments through visual inputs. Automatic surveillance of human activities has gained significant importance in both public and private spaces. It is often difficult to understand the complex dynamics of events in real-time scenarios due to camera movement, cluttered backgrounds, and occlusion. Existing anomaly detection systems are inefficient because of the high intra-class variation and inter-class similarity among activities. Hence, there is a demand to explore different kinds of information extracted from surveillance videos to improve overall performance. This can be achieved by learning features from multiple forms (views) of the given raw input data. We propose two novel methods based on the multi-view representation learning framework. The first is a hybrid multi-view representation learning approach that combines deep features extracted from a 3D spatiotemporal autoencoder (3D-STAE) with robust handcrafted features based on the spatiotemporal autocorrelation of gradients. The second is a deep multi-view representation learning approach that combines deep features extracted from two-stream STAEs to detect anomalies. Results on three standard benchmark datasets, namely Avenue, Live Videos, and BEHAVE, show that the proposed multi-view representations modeled with a one-class SVM perform significantly better than most recent state-of-the-art methods.

