Journal: ACM Transactions on Multimedia Computing, Communications, and Applications

Shuffled ImageNet Banks for Video Event Detection and Search

Abstract

This article addresses the detection and search of events in videos when video examples are scarce or entirely absent during training. ImageNet concept banks have been shown to be effective for such event detection and search. Rather than employing the standard concept bank of 1,000 ImageNet classes, we leverage the full 21,841-class dataset. We identify two problems with using the full dataset: (i) the number of examples per concept is imbalanced, and (ii) not all concepts are equally relevant for events. We propose to balance large-scale image hierarchies for pre-training by shuffling concepts with bottom-up and top-down operations, which addresses both example imbalance and concept relevance. The result is the shuffled ImageNet bank, a concept bank with an order of magnitude more concepts than standard ImageNet banks. Compared to standard ImageNet pre-training, our shuffles yield more discriminative representations for training event models from limited video event examples. For event search, the broad range of concepts enables a closer match between textual queries of events and concept detections in videos. Experimentally, we show the benefit of the proposed bank for event detection and event search, with state-of-the-art performance for both tasks on the challenging TRECVID Multimedia Event Detection and Ad-Hoc Video Search benchmarks.
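
The abstract does not spell out the bottom-up and top-down operations, so the following is only a minimal sketch of what one bottom-up balancing pass over a concept hierarchy could look like, not the authors' procedure. It assumes a hypothetical rule: a leaf concept with fewer than `min_examples` images is folded into its parent. The function name `merge_small_leaves`, the threshold, and the toy hierarchy are all illustrative assumptions.

```python
def merge_small_leaves(parent, counts, min_examples):
    """Fold under-populated leaf concepts into their parent (hypothetical rule).

    parent: dict mapping every concept to its parent (None for the root).
    counts: dict mapping concept to its number of image examples.
    """
    merged = {node: counts.get(node, 0) for node in parent}

    # Track how many children each concept still has, so that only
    # current leaves are ever merged and no concept is orphaned.
    n_children = {node: 0 for node in parent}
    for node, par in parent.items():
        if par is not None:
            n_children[par] += 1

    def depth(node):
        d = 0
        while parent[node] is not None:
            node = parent[node]
            d += 1
        return d

    # Visit the deepest concepts first so merges propagate upward.
    for node in sorted(parent, key=depth, reverse=True):
        par = parent[node]
        is_leaf = n_children[node] == 0
        if par is not None and is_leaf and merged[node] < min_examples:
            merged[par] += merged.pop(node)
            n_children[par] -= 1
    return merged


# Toy hierarchy (illustrative only, not taken from ImageNet statistics).
parent = {"animal": None, "dog": "animal", "husky": "dog",
          "beagle": "dog", "cat": "animal"}
counts = {"animal": 0, "dog": 80, "husky": 30, "beagle": 700, "cat": 900}

balanced = merge_small_leaves(parent, counts, min_examples=100)
print(balanced)
```

In this toy run, `husky` has too few examples and is absorbed into `dog`, while `beagle` and `cat` are kept as separate concepts. A top-down counterpart (for example, subsampling over-populated concepts) would be needed to address the other side of the imbalance; it is omitted here because the abstract gives no detail on it.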