Conference: IEEE Symposium on Computational Intelligence for Multimedia, Signal and Vision Processing

Improving codebook generation for action recognition using a mixture of Asymmetric Gaussians


Abstract

Human activity recognition is a crucial area of computer vision research and applications. Its goal is to automatically analyze and interpret ongoing events and their context from video data. Recently, the bag of visual words (BoVW) approach has been widely applied to human action recognition. Generally, a representative corpus of videos is used to build the visual-word dictionary, or codebook, with simple k-means clustering. This dictionary is then used to quantize the extracted features: each input descriptor receives the label of the closest cluster centroid, measured by Euclidean distance. Each video can thus be represented as a frequency histogram over visual words. However, the BoVW approach has several limitations, such as its need for a predefined codebook size, its dependence on the chosen set of visual words, and its use of hard-assignment clustering for histogram creation. In this paper, we aim to overcome these issues by using a mixture of Asymmetric Gaussians to build the codebook. Our method is able to identify the best dictionary size in an unsupervised manner, to represent the set of input feature vectors by an estimate of their density distribution, and to allow soft assignments. Furthermore, we validate the efficiency of the proposed algorithm for human action recognition.
