IEEE Transactions on Image Processing

Generalized Coupled Dictionary Learning Approach With Applications to Cross-Modal Matching


Abstract

Coupled dictionary learning (CDL) has recently emerged as a powerful technique with a wide variety of applications, ranging from image synthesis to classification tasks. In this paper, we extend existing CDL approaches in two ways to make them more suitable for the task of cross-modal matching. Data coming from different modalities may or may not be paired. For example, in an image-text retrieval problem, 100 images of a class may be available for training as opposed to only 50 text samples. Current CDL approaches are not designed to handle such scenarios, where classes of data points in one modality correspond to classes of data points in the other modality. Given the data from the two modalities, two dictionaries are first learnt, one per modality, so that the data have sparse representations with respect to their own dictionaries. Then, the sparse coefficients from the two modalities are transformed so that data from the same class are maximally correlated, while data from different classes have minimal correlation. This way of modeling the coupling between the sparse representations of the two modalities makes the approach work seamlessly for paired as well as unpaired data. The discriminative coupling term also makes the approach better suited to classification tasks. Experiments on several publicly available cross-modal data sets, namely, the CUHK photo-sketch face data set, the HFB visible and near-infrared facial image data set, the IXMAS multiview action recognition data set, the Wiki image-text data set, and the Multiple Features data set, show that this generalized CDL approach performs better than the state of the art for both paired and unpaired data.
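The two-stage pipeline the abstract describes (per-modality sparse coding, then a discriminative coupling of the codes) can be sketched as follows. This is an illustrative stand-in built with scikit-learn, not the authors' algorithm: the data dimensions, dictionary sizes, and the least-squares alignment of class-mean codes are all assumptions made for the sketch, whereas the paper's coupling term maximizes between-class discriminative correlation.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)

# Toy unpaired two-class data: 100 "image" samples vs. only 50 "text"
# samples, mirroring the abstract's example. All dimensions are made up.
X_img = rng.standard_normal((100, 20))
y_img = np.repeat([0, 1], 50)
X_txt = rng.standard_normal((50, 30))
y_txt = np.repeat([0, 1], 25)

# Step 1: learn one dictionary per modality so that each modality has a
# sparse representation with respect to its own dictionary.
dl_img = DictionaryLearning(n_components=15, max_iter=20, random_state=0)
A_img = dl_img.fit_transform(X_img)   # sparse codes, shape (100, 15)

dl_txt = DictionaryLearning(n_components=15, max_iter=20, random_state=0)
A_txt = dl_txt.fit_transform(X_txt)   # sparse codes, shape (50, 15)

# Step 2 (stand-in for the paper's discriminative coupling): because the
# samples are unpaired, couple the modalities at the class level by fitting
# a linear map between class-mean sparse codes, so that codes from the same
# class land close together across modalities.
M_img = np.vstack([A_img[y_img == c].mean(axis=0) for c in (0, 1)])
M_txt = np.vstack([A_txt[y_txt == c].mean(axis=0) for c in (0, 1)])
W, *_ = np.linalg.lstsq(M_img, M_txt, rcond=None)

# Project image-modality codes into the text-code space; nearest-neighbour
# search in that space gives a simple cross-modal matcher.
proj = A_img @ W
print(proj.shape)
```

Note how nothing in the coupling step requires a one-to-one pairing of samples: only class membership links the two modalities, which is exactly the scenario (100 images vs. 50 text samples) that motivates the generalized formulation.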

