
Learning from Audience Interaction: Multi-Instance Multi-Label Topic Model for Video Shots Annotating


Abstract

In recent years, audiences have been able to easily find TV series or movie videos of interest through labels. However, annotating individual video shots with labels, so that shots with specific semantic content can be located within these videos, remains an open problem. Some existing approaches train models on manually annotated shots, which makes labeling expensive. Other methods for this task assume that a video's content is fully captured by its video-level labels, ignoring that these labels are too coarse-grained to cover all of the video's content. In this paper, we propose a multi-instance multi-label topic model that annotates video shots using only video-level labels. In the multi-instance multi-label framework, video shots are treated as instances, and shot labels are learned from video-level labels, which greatly reduces labeling cost. In addition, our model learns label semantics by modeling the relationship between video labels and shots, addressing the coarse-grained label problem. Furthermore, we also learn keywords for every video. Experiments on a large-scale real-world dataset show that our model substantially outperforms baseline models.
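The sketch below is only a toy illustration of the multi-instance multi-label data layout the abstract describes: each video is a bag of shot-level instances that shares a set of coarse video-level labels. The class and function names (Video, annotate_shots) and the nearest-centroid label transfer are assumptions made for illustration; the paper's actual method is a topic model that learns shot labels and label semantics jointly, which is not reproduced here.

```python
# Hypothetical MIML data layout: videos as bags of shot instances with
# coarse video-level labels, plus a naive label-transfer baseline.
from dataclasses import dataclass
import numpy as np

@dataclass
class Video:
    shots: np.ndarray   # (num_shots, feature_dim) shot-level features
    labels: set         # coarse video-level labels, e.g. {"romance", "war"}

def annotate_shots(videos):
    """Naive baseline: build one centroid per label from all videos carrying
    that label, then tag each shot with the nearest centroid among its own
    video's labels. (The paper's MIML topic model learns shot labels jointly
    with latent topics instead.)"""
    # Accumulate shot features per label across the corpus.
    sums, counts = {}, {}
    for v in videos:
        for lab in v.labels:
            sums[lab] = sums.get(lab, 0) + v.shots.sum(axis=0)
            counts[lab] = counts.get(lab, 0) + len(v.shots)
    centroids = {lab: sums[lab] / counts[lab] for lab in sums}

    # Assign every shot to the closest centroid among its video's labels,
    # keeping shot labels consistent with the cheap video-level supervision.
    annotations = []
    for v in videos:
        labs = sorted(v.labels)
        cents = np.stack([centroids[l] for l in labs])
        dists = np.linalg.norm(v.shots[:, None, :] - cents[None, :, :], axis=2)
        annotations.append([labs[i] for i in dists.argmin(axis=1)])
    return annotations

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy = [Video(rng.normal(size=(5, 8)), {"romance", "war"}),
           Video(rng.normal(size=(3, 8)), {"war"})]
    print(annotate_shots(toy))
```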
