
Combining Contextual and Modal Action Information into a Weighted Multikernel SVM for Human Action Recognition



Abstract

Understanding human activities is one of the most challenging current topics in robotics. Whether for imitation or anticipation, robots must recognize which action a human is performing when they operate in a human environment. Action classification using a Bag of Words (BoW) representation has shown computational simplicity and good performance. However, the growing number of categories (including actions that are easily confused) and the addition of significant contextual and multimodal information, especially in human-robot interaction, have led most authors to focus their efforts on combining image descriptors. In this field, we propose the Contextual and Modal MultiKernel Learning Support Vector Machine (CMMKL-SVM). We introduce contextual information (objects directly related to the performed action, captured by calculating the codebook from a set of points belonging to objects) and multimodal information (features from depth and 3D images, which give two extra modalities of information in addition to RGB images). We code the action videos using a BoW representation with both contextual and modal information and feed them to the SVM through an optimal kernel formed as a linear combination of single kernels whose weights are learned. Experiments have been carried out on two action databases, CAD-120 and HMDB. On highly constrained databases our approach attains the same results as other similar state-of-the-art approaches, and its advantage grows as the database becomes more realistic, reaching a performance improvement of 14.27% for HMDB.
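The kernel combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the channel names, the toy histograms, the choice of a histogram intersection kernel, and the hand-set weights are all assumptions made for illustration (in the paper the weights are learned). It only shows how one Gram matrix per information channel can be merged into a single weighted kernel.

```python
from typing import Dict, List, Sequence

Histogram = Sequence[float]
Gram = List[List[float]]


def intersection_kernel(a: Histogram, b: Histogram) -> float:
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(x, y) for x, y in zip(a, b))


def gram_matrix(samples: List[Histogram]) -> Gram:
    """Pairwise kernel values between all BoW histograms of one channel."""
    return [[intersection_kernel(a, b) for b in samples] for a in samples]


def combine_kernels(grams: Dict[str, Gram], weights: Dict[str, float]) -> Gram:
    """Weighted sum of per-channel Gram matrices, weights rescaled to sum to 1."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("kernel weights must be non-negative")
    total = sum(weights[name] for name in grams)
    if total <= 0:
        raise ValueError("at least one kernel weight must be positive")
    n = len(next(iter(grams.values())))
    combined = [[0.0] * n for _ in range(n)]
    for name, gram in grams.items():
        beta = weights[name] / total
        for i in range(n):
            for j in range(n):
                combined[i][j] += beta * gram[i][j]
    return combined


# Toy L1-normalised BoW histograms for three videos (hypothetical data).
channels = {
    "rgb": [[0.5, 0.5, 0.0], [0.25, 0.25, 0.5], [0.0, 0.5, 0.5]],
    "depth": [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]],
    "object_context": [[0.75, 0.25], [0.25, 0.75], [0.5, 0.5]],
}
# Hand-set weights standing in for the learned ones (assumption).
weights = {"rgb": 2.0, "depth": 1.0, "object_context": 1.0}

grams = {name: gram_matrix(hists) for name, hists in channels.items()}
combined = combine_kernels(grams, weights)
```

A combined matrix of this kind could then be passed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. Because a non-negative weighted sum of valid kernels is itself a valid kernel, the result remains usable by a standard SVM solver.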
