Journal: IEEE Transactions on Image Processing

Multimodal Task-Driven Dictionary Learning for Image Classification

Abstract

Dictionary learning algorithms have been successfully used for both reconstructive and discriminative tasks, where an input signal is represented with a sparse linear combination of dictionary atoms. While these methods are mostly developed for single-modality scenarios, recent studies have demonstrated the advantages of feature-level fusion based on the joint sparse representation of the multimodal inputs. In this paper, we propose a multimodal task-driven dictionary learning algorithm under the joint sparsity constraint (prior) to enforce collaborations among multiple homogeneous/heterogeneous sources of information. In this task-driven formulation, the multimodal dictionaries are learned simultaneously with their corresponding classifiers. The resulting multimodal dictionaries can generate discriminative latent features (sparse codes) from the data that are optimized for a given task such as binary or multiclass classification. Moreover, we present an extension of the proposed formulation using a mixed joint and independent sparsity prior, which facilitates more flexible fusion of the modalities at the feature level. The efficacy of the proposed algorithms for multimodal classification is illustrated on four different applications: multimodal face recognition, multi-view face recognition, multi-view action recognition, and multimodal biometric recognition. It is also shown that, compared with the counterpart reconstructive-based dictionary learning algorithms, the task-driven formulations are more computationally efficient in the sense that they can be equipped with more compact dictionaries and still achieve superior performance.
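
To make the joint sparsity prior concrete, the following is a minimal sketch of joint sparse coding across modalities with fixed dictionaries. It is not the paper's task-driven algorithm: the dictionaries are given rather than learned, no classifier is trained, and the solver (proximal gradient descent with a row-wise group soft threshold on an l1/l2 penalty) is one common choice assumed here for illustration. The names `joint_sparse_code`, `joint_objective`, `lam`, and `n_iter` are hypothetical.

```python
import numpy as np

def joint_objective(xs, Ds, A, lam):
    """Sum of per-modality reconstruction errors plus the l1/l2 row penalty."""
    fit = sum(0.5 * np.sum((x - D @ A[:, s]) ** 2)
              for s, (x, D) in enumerate(zip(xs, Ds)))
    return fit + lam * np.sum(np.linalg.norm(A, axis=1))

def joint_sparse_code(xs, Ds, lam=0.1, n_iter=500):
    """Illustrative joint sparse coding (assumed solver, not the paper's).

    xs: list of S signals, one per modality (lengths may differ).
    Ds: list of S dictionaries; each has d atoms stored as columns.
    Returns A of shape (d, S); column s is the sparse code for modality s.
    The penalty acts on whole rows of A, so an atom index is either used
    by all modalities or dropped by all of them.
    """
    d, S = Ds[0].shape[1], len(Ds)
    A = np.zeros((d, S))
    # Step size from the Lipschitz constant of the smooth (least-squares) part.
    step = 1.0 / max(np.linalg.norm(D, 2) ** 2 for D in Ds)
    for _ in range(n_iter):
        # Gradient of the reconstruction term, one column per modality.
        G = np.column_stack([D.T @ (D @ A[:, s] - x)
                             for s, (x, D) in enumerate(zip(xs, Ds))])
        Z = A - step * G
        # Group soft threshold: shrink each row by its l2 norm.
        norms = np.linalg.norm(Z, axis=1, keepdims=True)
        A = Z * np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
    return A

# Toy data: two modalities of different dimension sharing three active atoms.
rng = np.random.default_rng(0)
Ds = [rng.standard_normal((n, 20)) for n in (30, 40)]
Ds = [D / np.linalg.norm(D, axis=0) for D in Ds]  # unit-norm atoms
truth = np.zeros((20, 2))
truth[[2, 7, 11]] = rng.uniform(1.0, 2.0, size=(3, 2))
xs = [D @ truth[:, s] for s, D in enumerate(Ds)]

A = joint_sparse_code(xs, Ds)
print("rows with nonzero codes:", np.flatnonzero(np.linalg.norm(A, axis=1)))
```

Because the threshold is applied to entire rows, the modalities are pushed toward the same set of active atoms while each keeps its own coefficients, which is the sense in which the codes are fused at the feature level. A task-driven method would additionally update the dictionaries and a classifier against a classification loss, which this sketch leaves out.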
