Published in: International Journal of Advanced Robotic Systems

Tree-based indexing for real-time ConvNet landmark-based visual place recognition

Abstract

Recent impressive studies on using ConvNet landmarks for visual place recognition take a three-step approach: (a) detection of landmarks, (b) description of the landmarks by ConvNet features extracted with a convolutional neural network, and (c) matching of the landmarks in the current view with those in the database views. Such an approach has been shown to achieve state-of-the-art accuracy even under significant viewpoint and environmental changes. However, the computational burden of step (c) largely prevents this approach from being applied in practice, owing to the cost of linear search in the high-dimensional space of the ConvNet features. In this article, we propose two simple and efficient search methods to tackle this issue. Both methods are built upon tree-based indexing. Given a set of ConvNet features of a query image, the first method directly searches for the features' approximate nearest neighbors in a tree structure that is constructed from the ConvNet features of the database images. The database images are voted on by the features in the query image, according to a lookup table that maps each ConvNet feature to its corresponding database image. The database image with the most votes is considered the solution. Our second method uses a coarse-to-fine procedure: the coarse step uses the first method to find the top-N database images, and the fine step performs a linear search in the Hamming space of the hash codes of the ConvNet features to determine the best match. Experimental results demonstrate that our methods achieve real-time search performance on five data sets of different sizes and under various conditions. Most notably, by achieving an average search time of 0.035 seconds per query, our second method improves the matching efficiency by three orders of magnitude over a linear search baseline on a database with 20,688 images, with negligible loss in place recognition accuracy.
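
The coarse-to-fine idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a plain k-d tree with exact nearest-neighbor search as a stand-in for the approximate tree search, a hypothetical sign-based hash (`binarize`), and a hypothetical fine-step score (the sum of each query landmark's smallest Hamming distance to a candidate image). The toy 3-D vectors and all function and variable names are illustrative only.

```python
from collections import Counter

def build_kdtree(points, depth=0):
    """Build a k-d tree from (vector, feature_id) pairs; returns a nested tuple or None."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, query, best=None):
    """Exact nearest neighbor (stand-in for approximate search); returns (sq_dist, feature_id)."""
    if node is None:
        return best
    (vec, fid), axis, left, right = node
    d = sq_dist(vec, query)
    if best is None or d < best[0]:
        best = (d, fid)
    diff = query[axis] - vec[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, best)
    if diff ** 2 < best[0]:  # the far side can only help if the splitting plane is close enough
        best = nearest(far, query, best)
    return best

def vote(tree, feature_to_image, query_features, top_n):
    """Coarse step: each query feature votes for the image that owns its nearest neighbor."""
    votes = Counter()
    for q in query_features:
        _, fid = nearest(tree, q)
        votes[feature_to_image[fid]] += 1
    return [img for img, _ in votes.most_common(top_n)]

def binarize(vec):
    """Hypothetical hash: one bit per dimension, set when the value is positive."""
    code = 0
    for x in vec:
        code = (code << 1) | int(x > 0)
    return code

def hamming(a, b):
    return bin(a ^ b).count("1")

def refine(candidates, image_codes, query_features):
    """Fine step: linear search in Hamming space, restricted to the coarse candidates."""
    query_codes = [binarize(q) for q in query_features]
    def score(img):
        # Assumed score: sum over query landmarks of the closest code distance in img.
        return sum(min(hamming(qc, c) for c in image_codes[img]) for qc in query_codes)
    return min(candidates, key=score)

# Toy database: image name -> landmark feature vectors (real ConvNet features are high-dimensional).
db = {
    "img_a": [(0.9, 0.8, -0.7), (0.5, -0.6, 0.4)],
    "img_b": [(-0.8, 0.9, 0.6), (-0.5, -0.4, -0.9)],
    "img_c": [(0.7, -0.9, -0.8), (-0.6, 0.7, -0.5)],
}

features, feature_to_image, image_codes = [], {}, {}
for img, vecs in db.items():
    image_codes[img] = [binarize(v) for v in vecs]
    for v in vecs:
        fid = len(features)
        features.append((v, fid))
        feature_to_image[fid] = img  # lookup table: feature -> database image

tree = build_kdtree(features)
query = [(0.8, 0.7, -0.6), (0.4, -0.5, 0.5), (-0.7, 0.8, -0.4)]
coarse = vote(tree, feature_to_image, query, top_n=2)
best = refine(coarse, image_codes, query)
print(coarse, best)
```

In this toy case two of the three query landmarks vote for `img_a` and one for `img_c`, so both survive the coarse step, and the Hamming re-ranking then selects `img_a`. The point of the design is that the expensive comparison runs only over the top-N candidates rather than the whole database.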