IEEE Conference on Computer Vision and Pattern Recognition

Aggregating Image and Text Quantized Correlated Components


Abstract

Cross-modal tasks arise naturally for multimedia content that can be described along two or more modalities, such as visual content and text. Such tasks require "translating" information from one modality to another. Methods like kernelized canonical correlation analysis (KCCA) attempt to solve them by finding aligned subspaces in the description spaces of the different modalities. Since they favor correlated information over modality-specific information, these methods have shown some success in both cross-modal and bi-modal tasks. However, we show that directly using the subspace alignment obtained by KCCA yields only coarse translation abilities. To address this problem, we first put forward a new representation method that aggregates the information provided by the projections of both modalities onto their aligned subspaces. We further propose a method relying on neighborhoods in these subspaces to complete uni-modal information. Our proposal achieves state-of-the-art results for bi-modal classification on Pascal VOC07 and improves the state of the art by over 60% for cross-modal retrieval on Flickr8K/30K.
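The pipeline the abstract describes — align the two description spaces, aggregate the aligned projections into a bi-modal representation, and complete a missing modality from neighbors in the aligned subspace — can be illustrated with a minimal sketch. The code below uses plain linear CCA from scikit-learn rather than the kernelized variant; the concatenation-based aggregation and the mean-of-neighbors completion are simplifying assumptions for illustration, not the paper's exact quantization and aggregation scheme.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)

# Toy paired data: n samples described in two modalities,
# e.g. 128-d image features and 64-d text features sharing latent structure.
n, d_img, d_txt, k = 200, 128, 64, 10
latent = rng.normal(size=(n, k))
X_img = latent @ rng.normal(size=(k, d_img)) + 0.1 * rng.normal(size=(n, d_img))
X_txt = latent @ rng.normal(size=(k, d_txt)) + 0.1 * rng.normal(size=(n, d_txt))

# 1. Align the two description spaces (linear CCA stands in for KCCA here).
cca = CCA(n_components=k)
cca.fit(X_img, X_txt)
Z_img, Z_txt = cca.transform(X_img, X_txt)  # projections onto aligned subspaces

# 2. Aggregate: build a bi-modal representation from both projections
#    (concatenation is an illustrative choice, not the paper's method).
Z_bimodal = np.concatenate([Z_img, Z_txt], axis=1)

# 3. Neighborhood completion: for an image-only query, estimate the missing
#    text-side projection from its nearest neighbors in the aligned image
#    subspace (hypothetical helper; averaging is a simplifying assumption).
def complete_text_projection(z_img_query, Z_img_bank, Z_txt_bank, n_neighbors=5):
    dists = np.linalg.norm(Z_img_bank - z_img_query, axis=1)
    nn = np.argsort(dists)[:n_neighbors]
    return Z_txt_bank[nn].mean(axis=0)

z_txt_hat = complete_text_projection(cca.transform(X_img[:1])[0], Z_img, Z_txt)
```

Averaging the text-side projections of the nearest image-side neighbors reflects the idea that proximity in the aligned subspace carries across modalities, which is what makes uni-modal completion possible at all.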
