Improving codebook generation for action recognition using a mixture of Asymmetric Gaussians

Abstract

Human activity recognition is a crucial area of computer vision research and applications. Its goal is to automatically analyze and interpret ongoing events and their context from video data. Recently, the bag of visual words (BoVW) approach has been widely applied to human action recognition. Generally, a representative corpus of videos is used to build the visual-word dictionary, or codebook, with a simple k-means clustering approach. This dictionary is then used to quantize the extracted features: each input descriptor receives the label of the closest cluster centroid, measured by Euclidean distance. Each video can thus be represented as a frequency histogram over visual words. However, the BoVW approach has several limitations, such as its need for a predefined codebook size, its dependence on the chosen set of visual words, and its use of hard-assignment clustering for histogram creation. In this paper, we aim to overcome these issues by using a mixture of Asymmetric Gaussians to build the codebook. Our method is able to identify the best size for the dictionary in an unsupervised manner, to represent the set of input feature vectors by an estimate of their density distribution, and to allow soft assignments. Furthermore, we validate the efficiency of the proposed algorithm for human action recognition.
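
The contrast between hard and soft assignment described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract gives no formulas, so the density below is the commonly used asymmetric Gaussian form (one mean per dimension, with separate left and right standard deviations), taken here as an assumption. The mixture parameters are hand-picked rather than learned, and the function names are hypothetical. The sketch also omits the parameter estimation and the unsupervised selection of the codebook size that the paper describes.

```python
import numpy as np


def hard_histogram(descriptors, centroids):
    """Classic BoVW: each descriptor votes for its nearest centroid (Euclidean)."""
    # Pairwise squared distances, shape (n_descriptors, n_words).
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    hist = np.bincount(labels, minlength=len(centroids)).astype(float)
    return hist / hist.sum()


def asymmetric_gaussian_logpdf(x, mu, sigma_left, sigma_right):
    """Log-density of independent per-dimension asymmetric Gaussians (assumed form).

    x has shape (n, d); mu, sigma_left and sigma_right have shape (d,).
    Values below the mean use sigma_left, the others use sigma_right.
    """
    sigma = np.where(x < mu, sigma_left, sigma_right)
    # Normalizer sqrt(2/pi) / (sigma_left + sigma_right) makes each 1-D density integrate to 1.
    log_norm = 0.5 * np.log(2.0 / np.pi) - np.log(sigma_left + sigma_right)
    return (log_norm - (x - mu) ** 2 / (2.0 * sigma ** 2)).sum(axis=1)


def soft_histogram(descriptors, weights, mus, sigmas_left, sigmas_right):
    """Soft assignment: average the posterior responsibilities of each component."""
    log_p = np.stack(
        [
            np.log(w) + asymmetric_gaussian_logpdf(descriptors, m, sl, sr)
            for w, m, sl, sr in zip(weights, mus, sigmas_left, sigmas_right)
        ],
        axis=1,
    )  # shape (n_descriptors, n_components)
    log_p -= log_p.max(axis=1, keepdims=True)  # stabilize before exponentiating
    resp = np.exp(log_p)
    resp /= resp.sum(axis=1, keepdims=True)
    return resp.mean(axis=0)


# Toy data standing in for local video descriptors (hypothetical values).
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(200, 4))

# Hand-picked two-word codebook; a real system would estimate these parameters.
mus = np.array([[-1.0] * 4, [1.0] * 4])
sigmas_left = np.array([[1.0] * 4, [0.5] * 4])
sigmas_right = np.array([[0.5] * 4, [1.5] * 4])
weights = np.array([0.5, 0.5])

print("hard:", hard_histogram(descriptors, mus))
print("soft:", soft_histogram(descriptors, weights, mus, sigmas_left, sigmas_right))
```

Both functions return a normalized histogram over the codebook entries. In the soft version a descriptor lying between two components contributes fractionally to each, instead of casting a single vote.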

Bibliographic details

Source
IEEE Symposium on Computational Intelligence for Multimedia, Signal and Vision Processing, 2014, pp. 1-7 (7 pages)
Authors
Elguebaly Tarek; Bouguila Nizar
Original format
PDF
Keywords
Gaussian processes; image motion analysis; image recognition; image representation; mixture models; video coding; BoVW approach; asymmetric Gaussian mixture; bag of visual words; codebook generation improvement; density distribution; feature vector representation; human action recognition; soft assignments; video representation; visual words dictionary; Detectors; Dictionaries; Feature extraction; Hidden Markov models; Histograms; Vectors; Visualization; Gaussian mixture; Unsupervised learning; expectation-maximization
