Conference: International Conference on Pattern Recognition

From Text to Video: Exploiting Mid-Level Semantics for Large-Scale Video Classification


Abstract

Automatically classifying large-scale video data is an urgent yet challenging task. To bridge the semantic gap between low-level features and high-level video semantics, we propose a method that represents videos by their mid-level semantics. Inspired by text classification, we regard the visual objects in a video as the words in a document and adapt the TF-IDF word-weighting method to encode videos by their visual objects. We also extend the proposed method to account for the characteristics of videos. We integrate the proposed semantic encoding method with the popular two-stream CNN model for video classification. Experiments are conducted on two large-scale video datasets, CCV and ActivityNet. The experimental results validate the effectiveness of our method.
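
The objects-as-words analogy can be illustrated with a minimal sketch. It assumes each video has already been reduced to a list of detected object labels (the labels below are made up for illustration), and it uses a standard smoothed TF-IDF weighting. The abstract does not give the authors' exact formula or their video-specific extensions, so this is not their implementation; the function name `tfidf_encode` is hypothetical.

```python
import math
from collections import Counter


def tfidf_encode(videos, vocabulary):
    """Encode each video (a list of object labels) as a TF-IDF vector.

    Assumption: labels come from some upstream object detector or
    classifier, which is outside the scope of this sketch.
    """
    n_videos = len(videos)

    # Document frequency: in how many videos each object appears at least once.
    doc_freq = Counter()
    for objects in videos:
        doc_freq.update(set(objects))

    vectors = []
    for objects in videos:
        counts = Counter(objects)
        total = len(objects)
        vec = []
        for obj in vocabulary:
            # Term frequency: share of this video's detections that are `obj`.
            tf = counts[obj] / total if total else 0.0
            # Smoothed inverse document frequency (an assumed variant,
            # chosen so that unseen objects do not cause division by zero).
            idf = math.log((1 + n_videos) / (1 + doc_freq[obj])) + 1.0
            vec.append(tf * idf)
        vectors.append(vec)
    return vectors


# Hypothetical detections for three short videos.
videos = [
    ["person", "ball", "person"],
    ["person", "dog"],
    ["car", "car", "road"],
]
vocabulary = sorted({obj for objects in videos for obj in objects})
vectors = tfidf_encode(videos, vocabulary)
```

Each resulting vector has one weight per object in the vocabulary. An object that dominates one video but is rare across the collection (here, `car`) receives a high weight, while an object absent from a video gets zero. Vectors of this kind could then be combined with CNN features, although how the paper performs that fusion is not described in the abstract.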
