Learning from Collective Intelligence: Feature Learning Using Social Images and Tags

Journal: ACM Transactions on Multimedia Computing, Communications, and Applications



Abstract

Feature representation for visual content is key to progress in many fundamental applications such as annotation and cross-modal retrieval. Although recent advances in deep feature learning offer a promising route towards these tasks, they are limited in application domains where high-quality, large-scale training data are expensive to obtain. In this article, we propose a novel deep feature learning paradigm based on social collective intelligence, which can be acquired from the inexhaustible social multimedia content on the Web, chiefly social images and tags. Unlike existing feature learning approaches that rely on high-quality image-label supervision, our weak supervision is acquired by mining visual-semantic embeddings from noisy, sparse, and diverse social image collections. The resultant image-word embedding space can be used to (1) fine-tune deep visual models for low-level feature extraction and (2) seek sparse representations as high-level cross-modal features for both image and text. We offer an easy-to-use implementation of the proposed paradigm, which is fast and compatible with any state-of-the-art deep architecture. Extensive experiments on several benchmarks demonstrate that the cross-modal features learned by our paradigm significantly outperform those of other methods in various applications such as content-based retrieval, classification, and image captioning.
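
The abstract does not say how the sparse representations in step (2) are computed. As a minimal sketch of the general idea only, the code below assumes an L1-regularized least-squares objective solved with ISTA (iterative soft-thresholding), with word embeddings acting as dictionary atoms. The function names, the toy three-word dictionary, and the values of `lam` and `steps` are hypothetical and are not taken from the article or its implementation.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal operator of the L1 norm)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_code(x, dictionary, lam=0.1, steps=200):
    """ISTA for min_a 0.5 * ||x - D^T a||^2 + lam * ||a||_1.

    x          : embedding of one image or text item, length d
    dictionary : list of k atoms (assumed here to be word embeddings),
                 each of length d
    Returns the length-k code a; most entries end up exactly zero.
    """
    k, d = len(dictionary), len(x)
    a = [0.0] * k
    # Step size 1/L. The squared Frobenius norm of D is a safe (if loose)
    # upper bound on the largest eigenvalue of D D^T.
    lipschitz = sum(v * v for atom in dictionary for v in atom)
    if lipschitz == 0.0:
        return a  # an all-zero dictionary can only produce the zero code
    step = 1.0 / lipschitz
    for _ in range(steps):
        # Residual r = D^T a - x
        recon = [sum(a[j] * dictionary[j][i] for j in range(k)) for i in range(d)]
        r = [recon[i] - x[i] for i in range(d)]
        # Gradient of the smooth term: g_j = <atom_j, r>
        grad = [sum(dictionary[j][i] * r[i] for i in range(d)) for j in range(k)]
        # Gradient step followed by soft-thresholding
        a = [soft_threshold(a[j] - step * grad[j], step * lam) for j in range(k)]
    return a


# Toy usage: three orthonormal "word" atoms (hypothetical tags) and one
# image embedding that lies mostly along the first atom.
atoms = [[1.0, 0.0, 0.0],   # hypothetical tag "beach"
         [0.0, 1.0, 0.0],   # hypothetical tag "dog"
         [0.0, 0.0, 1.0]]   # hypothetical tag "car"
image_embedding = [0.9, 0.05, 0.0]
code = sparse_code(image_embedding, atoms, lam=0.1)
print(code)  # roughly [0.8, 0.0, 0.0]: only the first atom stays active
```

With orthonormal atoms the solution is simply each coordinate of the input shrunk by `lam`, which makes the toy case easy to check by hand. Because the same dictionary can encode an image embedding or a text embedding, the resulting code is comparable across modalities, which is the property the abstract attributes to its high-level features.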
