Conference: IEEE/CVF Conference on Computer Vision and Pattern Recognition

Multimodal Visual Concept Learning with Weakly Supervised Techniques


Abstract

Despite the availability of a huge amount of video data accompanied by descriptive texts, it is not always easy to exploit the information contained in natural language to automatically recognize video concepts. Towards this goal, in this paper we use textual cues as a means of supervision, introducing two weakly supervised techniques that extend the Multiple Instance Learning (MIL) framework: Fuzzy Sets Multiple Instance Learning (FSMIL) and Probabilistic Labels Multiple Instance Learning (PLMIL). The former encodes the spatio-temporal imprecision of the linguistic descriptions with Fuzzy Sets, while the latter models different interpretations of each description's semantics with Probabilistic Labels; both are formulated through a convex optimization algorithm. In addition, we provide a novel technique, based on semantic similarity computations, to extract weak labels in the presence of complex semantics. We evaluate our methods on two distinct problems, namely face and action recognition, in the challenging and realistic setting of movies accompanied by their screenplays, contained in the COGNIMUSE database. We show that, on both tasks, our method considerably outperforms a state-of-the-art weakly supervised approach, as well as other baselines.
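To make the idea of encoding temporal imprecision with fuzzy sets more concrete, the following is a minimal illustrative sketch, not the formulation used in the paper. It assumes a trapezoidal membership function over time: instances inside the interval of a textual description get membership 1, and membership decays linearly to 0 over a margin on either side. The function names, the trapezoidal shape, and the `margin` parameter are all hypothetical choices made for illustration only.

```python
def trapezoid_membership(t, start, end, margin):
    """Degree in [0, 1] to which time t belongs to a description spanning [start, end].

    Assumption (not from the paper): membership is 1 inside the interval and
    decays linearly to 0 over `margin` seconds on either side.
    """
    if margin <= 0:
        # Degenerate case: a crisp (non-fuzzy) interval.
        return 1.0 if start <= t <= end else 0.0
    if start <= t <= end:
        return 1.0
    if t < start:
        return max(0.0, 1.0 - (start - t) / margin)
    return max(0.0, 1.0 - (t - end) / margin)


def fuzzy_bag(instance_times, start, end, margin):
    """Return (time, membership) pairs for instances with non-zero membership."""
    bag = []
    for t in instance_times:
        mu = trapezoid_membership(t, start, end, margin)
        if mu > 0.0:
            bag.append((t, mu))
    return bag


# Toy example: hypothetical instance timestamps (seconds) near a description
# whose screenplay alignment spans 10 s to 15 s.
times = [8.0, 10.0, 12.0, 15.0, 17.0, 20.0]
print(fuzzy_bag(times, start=10.0, end=15.0, margin=4.0))
# → [(8.0, 0.5), (10.0, 1.0), (12.0, 1.0), (15.0, 1.0), (17.0, 0.5)]
```

In a crisp MIL bag, the instances at 8 s and 17 s would be either fully included or fully excluded; under this sketch they contribute with reduced weight, which is one simple way to express uncertainty about where a description begins and ends.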
