IEEE Transactions on Pattern Analysis and Machine Intelligence

Improving Large-Scale Image Retrieval Through Robust Aggregation of Local Descriptors



Abstract

Visual search and image retrieval underpin numerous applications; however, the task remains challenging, predominantly due to the variability of object appearance and the ever-increasing size of the databases, which often exceed billions of images. Prior-art methods rely on the aggregation of local scale-invariant descriptors, such as SIFT, via mechanisms including the Bag of Visual Words (BoW), the Vector of Locally Aggregated Descriptors (VLAD) and Fisher Vectors (FV). However, their performance still falls short of what is required. This paper presents a novel method for deriving a compact and distinctive representation of image content, called the Robust Visual Descriptor with Whitening (RVD-W). It significantly advances the state of the art. In our approach, local descriptors are rank-assigned to multiple clusters. Residual vectors are then computed in each cluster, normalized using a direction-preserving normalization function, and aggregated based on the neighborhood rank. Importantly, the residual vectors are de-correlated and whitened in each cluster before aggregation, leading to a balanced energy distribution across dimensions and significantly improved performance. We also propose a new post-PCA normalization approach which improves the separability between matching and non-matching global descriptors. This new normalization benefits not only our RVD-W descriptor but also improves existing approaches based on FV and VLAD aggregation. Furthermore, we show that the aggregation framework developed with hand-crafted SIFT features also performs exceptionally well with Convolutional Neural Network (CNN) based features. The RVD-W pipeline outperforms state-of-the-art global descriptors on both the Holidays and Oxford datasets.
On the large-scale datasets Holidays1M and Oxford1M, the SIFT-based RVD-W representation obtains a mAP of 45.1 and 35.1 percent respectively, while the CNN-based RVD-W achieves a mAP of 63.5 and 44.8 percent, all superior to the state of the art.
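The aggregation steps described in the abstract (rank assignment of each local descriptor to several clusters, residual computation, direction-preserving normalization, and rank-weighted accumulation) can be sketched as follows. This is a minimal illustrative sketch of the general idea, not the authors' exact RVD-W pipeline: the function name, the rank weights, and the final L2 normalization (standing in for the learned per-cluster whitening, which requires matrices estimated offline from training data) are all assumptions.

```python
import numpy as np

def aggregate_rvd_like(descriptors, centers, k_nn=3, rank_weights=(1.0, 0.5, 0.25)):
    """VLAD-style aggregation with rank assignment to multiple clusters.

    A sketch of the ideas in the abstract; the per-cluster de-correlation /
    whitening step of RVD-W (learned offline) is replaced here by a simple
    L2 normalization of the final global descriptor.
    """
    K, d = centers.shape
    agg = np.zeros((K, d))
    # Distance from every descriptor to every cluster center.
    dists = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
    # Rank-assign each descriptor to its k_nn nearest clusters.
    nearest = np.argsort(dists, axis=1)[:, :k_nn]
    for x, clusters in zip(descriptors, nearest):
        for rank, c in enumerate(clusters):
            r = x - centers[c]                 # residual vector in cluster c
            n = np.linalg.norm(r)
            if n > 0:
                r = r / n                      # direction-preserving normalization
            agg[c] += rank_weights[rank] * r   # weight by neighborhood rank
    v = agg.ravel()                            # concatenate per-cluster aggregates
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v
```

The resulting global descriptor has dimension K×d and unit L2 norm, so two images can be compared by a dot product; in the full pipeline a PCA projection and the proposed post-PCA normalization would follow.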