Combining multi-representation for multimedia event detection using co-training

Abstract

In recent years, multimedia event detection has attracted extensive research attention because of the exponential increase in the volume of web video data. Traditional approaches usually rely on a single visual representation, which may lack sufficient descriptive power. How to jointly employ multiple types of visual representation to facilitate multimedia event detection (MED) in videos remains an open problem. In this work, we propose a novel system for event detection based on a combination of multi-view representations and a co-training algorithm. Specifically, given several types of low-level visual features (i.e., Convolutional Neural Networks (CNNs) and Fisher vectors), we first train an initial classifier for each type of visual feature. Then, we use these classifiers to separately predict the labels of unlabeled videos, and the videos whose predictions are consistent are merged into the training set. We alternately repeat the processes of training the classifiers and enlarging the training set until convergence. To investigate the relationship among different types of visual features, the prediction scores of the two classifiers are fused by a linear weighted fusion method. We evaluate our MED system on the TRECVID MED11 data set, and the experimental results demonstrate the outstanding performance of the proposed approach compared with several other state-of-the-art algorithms. (C) 2016 Elsevier B.V. All rights reserved.
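
The loop described in the abstract (train one classifier per feature type, add unlabeled videos on which the classifiers agree, repeat until nothing changes, then fuse the two scores linearly) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the nearest-centroid classifier stands in for whatever classifier the paper uses, and the names `CentroidClassifier`, `co_train`, `fuse`, the `threshold` confidence cut-off, and the `weight` parameter are all assumptions introduced here.

```python
import math


class CentroidClassifier:
    """Hypothetical stand-in classifier: scores a sample by its distance to class centroids."""

    def fit(self, X, y):
        # Assumes both classes (0 and 1) are present in the labeled data.
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def score(self, x):
        """Return a value in (0, 1); higher means closer to the positive centroid."""
        d0 = math.dist(x, self.centroids[0])
        d1 = math.dist(x, self.centroids[1])
        return 1.0 / (1.0 + math.exp(d1 - d0))


def co_train(lab_a, lab_b, labels, unl_a, unl_b, threshold=0.7, max_rounds=10):
    """Co-training over two feature views (a and b) of the same videos.

    `threshold` is an assumed confidence cut-off; the abstract only requires
    that the two classifiers give consistent predictions.
    """
    Xa, Xb, y = list(lab_a), list(lab_b), list(labels)
    pool = list(range(len(unl_a)))  # indices of still-unlabeled videos
    for _ in range(max_rounds):
        clf_a = CentroidClassifier().fit(Xa, y)
        clf_b = CentroidClassifier().fit(Xb, y)
        remaining = []
        for i in pool:
            sa, sb = clf_a.score(unl_a[i]), clf_b.score(unl_b[i])
            pa, pb = int(sa >= 0.5), int(sb >= 0.5)
            confident = min(max(sa, 1 - sa), max(sb, 1 - sb)) >= threshold
            if pa == pb and confident:
                # Both views agree: move the video into the training set.
                Xa.append(unl_a[i])
                Xb.append(unl_b[i])
                y.append(pa)
            else:
                remaining.append(i)
        if len(remaining) == len(pool):  # nothing was added: treat as converged
            break
        pool = remaining
    # Refit on the final, enlarged training set.
    return CentroidClassifier().fit(Xa, y), CentroidClassifier().fit(Xb, y)


def fuse(score_a, score_b, weight=0.5):
    """Linear weighted fusion of the two classifiers' prediction scores."""
    return weight * score_a + (1.0 - weight) * score_b
```

With toy two-view data, `co_train(lab_a, lab_b, labels, unl_a, unl_b)` returns one classifier per view, and `fuse(clf_a.score(xa), clf_b.score(xb), weight=0.5)` gives the combined score for a new video. The abstract does not say how the fusion weight is chosen, so the `weight` here is a free parameter.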

Bibliographic record

  • Source
    Neurocomputing | 2016, Issue 12 | pp. 11-18 | 8 pages
  • Author affiliations

    Univ Elect Sci & Technol China, Chengdu, Peoples R China;

    Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China;

    Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China;

    Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Multimedia event detection; Convolutional neural network; Co-training;
