Journal: IEEE Transactions on Image Processing

Zero-Shot Learning via Robust Latent Representation and Manifold Regularization


Abstract

Zero-shot learning (ZSL) for visual recognition aims to accurately recognize objects of unseen classes by mapping visual features to an embedding space spanned by class semantic information. However, the semantic gap between visual features and their underlying semantics remains a major obstacle in ZSL. Conventional ZSL methods typically construct this mapping from the original visual features, which are independent of the ZSL task, thus degrading prediction performance. In this paper, we propose an effective method to uncover an appropriate latent representation of the data for zero-shot classification. Specifically, we formulate a novel framework that jointly learns the latent subspace and a cross-modal embedding linking visual features with their semantic representations. The proposed framework combines feature learning and semantics prediction, so that the learned data representation is more discriminative for predicting the semantic vectors, which improves overall classification performance. To learn a robust latent subspace, we explicitly avoid information loss by ensuring that the obtained data representation can reconstruct the original data. An efficient algorithm is designed to solve the proposed optimization problem. To fully exploit the intrinsic geometric structure of the data, we develop a manifold regularization strategy that refines the learned semantic representations, leading to further improvements in classification performance. To validate the effectiveness of the proposed approach, extensive experiments are conducted on three ZSL benchmarks, and encouraging results are achieved compared with state-of-the-art ZSL methods.
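
The abstract does not state the objective function or the optimization algorithm, so the following is only a generic sketch of the kind of pipeline it describes, not the authors' method. It assumes a plain ridge-regression map from visual features to semantic vectors (in place of the jointly learned latent subspace), a k-nearest-neighbor graph Laplacian as the manifold regularizer, and cosine nearest-prototype classification. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np


def fit_ridge_embedding(X, S, lam=0.1):
    """Closed-form ridge regression: W minimizes ||X W - S||^2 + lam ||W||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ S)


def laplacian_smooth(P, X, k=5, alpha=0.5):
    """Refine predictions P by solving min_F ||F - P||^2 + alpha * tr(F^T L F),
    where L is the Laplacian of a symmetrized kNN graph built on X."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(dist2, np.inf)            # a point is not its own neighbor
    nbrs = np.argsort(dist2, axis=1)[:, :k]    # indices of the k nearest points
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    A = np.maximum(A, A.T)                     # make the graph undirected
    L = np.diag(A.sum(axis=1)) - A
    return np.linalg.solve(np.eye(n) + alpha * L, P)


def nearest_prototype(F, prototypes):
    """Assign each row of F to the class prototype with highest cosine similarity."""
    Fn = F / (np.linalg.norm(F, axis=1, keepdims=True) + 1e-12)
    Pn = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-12)
    return np.argmax(Fn @ Pn.T, axis=1)


def run_demo(seed=0, per_class=20, noise=0.01):
    """Synthetic check: train on 4 seen classes, classify 2 unseen classes."""
    rng = np.random.default_rng(seed)
    seen = np.array([[1.0, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]])
    unseen = np.array([[1.0, 0, 1], [0, 1, 1]])
    M = rng.normal(size=(3, 8))                # assumed semantic-to-visual map

    def sample(protos):
        y = np.repeat(np.arange(len(protos)), per_class)
        X = protos[y] @ M + noise * rng.normal(size=(len(y), 8))
        return X, y

    X_tr, y_tr = sample(seen)
    X_te, y_te = sample(unseen)
    W = fit_ridge_embedding(X_tr, seen[y_tr])  # learned on seen classes only
    P = laplacian_smooth(X_te @ W, X_te)       # manifold refinement on test data
    return float(np.mean(nearest_prototype(P, unseen) == y_te))
```

Calling `run_demo()` returns the fraction of unseen-class samples assigned to the correct prototype on the synthetic data. The smoothing step here is transductive: it builds its graph from the unlabeled test features, which is one common way to apply manifold regularization and may differ from the strategy used in the paper.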
