Conference: IEEE International Conference on Big Data

CUImage: A Neverending Learning Platform on a Convolutional Knowledge Graph of Billion Web Images

Abstract

Pretraining visual features by image classification on ImageNet has been an indispensable step towards many advanced perception systems over the last decade. ImageNet is the most prevalent database for supervised pretraining of image features. Unlike ImageNet, which assumes that visual concepts are static and independent of each other, this work presents a neverending learning platform, termed CUImage, which learns visual representations on a knowledge graph of billions of images, a data scale several orders of magnitude larger than ImageNet. A novel dynamic graph convolutional network (GCN) is proposed to learn visual concepts. When new data are presented, the GCN is updated dynamically, so that new concepts can be discovered or existing concepts can be merged. This is enabled by three main components in CUImage: Data Dispersion (DD), Data Management and Mining (DMM), and Data Evaluation (DE). These three components are built on top of a computer cluster with thousands of GPU/CPU cores and petabytes of parallel storage. So far, CUImage has processed and managed more than 2 million visual concepts across 2 billion images. To evaluate the learned representation, we transfer the pretrained features to several challenging benchmarks, such as image recognition on ImageNet and object detection on MS-COCO. We achieve state-of-the-art results, significantly surpassing systems that used ImageNet for pretraining. The code, data, and models will be released.