Conference on Neural Information Processing Systems

Learning Representations by Maximizing Mutual Information Across Views

Abstract

We propose an approach to self-supervised representation learning based on maximizing mutual information between features extracted from multiple views of a shared context. For example, one could produce multiple views of a local spatiotemporal context by observing it from different locations (e.g., camera positions within a scene), and via different modalities (e.g., tactile, auditory, or visual). Or, an ImageNet image could provide a context from which one produces multiple views by repeatedly applying data augmentation. Maximizing mutual information between features extracted from these views requires capturing information about high-level factors whose influence spans multiple views - e.g., presence of certain objects or occurrence of certain events. Following our proposed approach, we develop a model which learns image representations that significantly outperform prior methods on the tasks we consider. Most notably, using self-supervised learning, our model learns representations which achieve 68.1% accuracy on ImageNet using standard linear evaluation. This beats prior results by over 12% and concurrent results by 7%. When we extend our model to use mixture-based representations, segmentation behaviour emerges as a natural side-effect. Our code is available online: https://github.com/Philip-Bachman/amdim-public.
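As a concrete illustration of the objective described above, here is a minimal sketch in PyTorch of an InfoNCE-style contrastive loss, one standard way to maximize a lower bound on mutual information between features extracted from two augmented views of the same images. This is a sketch under assumptions, not the authors' AMDIM implementation (see the linked repository for that); the names encoder, augment, and temperature are illustrative.

import torch
import torch.nn.functional as F

def infonce_loss(feats_a, feats_b, temperature=0.1):
    # NCE-based lower bound on mutual information between two views.
    # feats_a, feats_b: (batch, dim) features from two augmented views;
    # row i of each tensor comes from the same underlying image.
    a = F.normalize(feats_a, dim=1)
    b = F.normalize(feats_b, dim=1)
    logits = a @ b.t() / temperature  # (batch, batch) pairwise similarities
    targets = torch.arange(a.size(0), device=a.device)
    # Matching pairs (the diagonal) are positives; every other pair in the
    # batch serves as a negative sample.
    return F.cross_entropy(logits, targets)

# Hypothetical usage: push two augmentations of the same batch through a
# shared encoder and maximize agreement between the resulting features.
# z1, z2 = encoder(augment(images)), encoder(augment(images))
# loss = infonce_loss(z1, z2)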