One net to rule them all: efficient recognition and retrieval of POI from geo-tagged photos

Abstract

In this work, we present DeepCamera, a novel framework that combines visual recognition and spatial recognition to identify places-of-interest (POIs) in smartphone photos. The framework exploits both the deep visual features and the geographic features of images. For visual recognition, we first design the HashNet model, which extends an ordinary convolutional neural network (ConvNet) by adding a "hash layer" after the last fully connected layer. We then compress multiple pre-trained deep HashNets into a single shallow hash network, named "SHNet", that outputs semantic labels and compact hash codes simultaneously. As a result, SHNet significantly reduces time and memory consumption during POI recognition. For spatial recognition, a new layer called the Spatial Layer is appended to a ConvNet to capture spatial information. Finally, by plugging the Spatial Layer into SHNet, both visual and spatial knowledge contribute to a hybrid probability distribution over all possible POI candidates. Notably, the proposed SHNet model can also be used for general visual recognition and retrieval. Experiments conducted on real-world datasets and classic datasets (MNIST and CIFAR-10) demonstrate the competitive accuracy and run-time performance of the proposed framework.
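
The abstract does not spell out how the hash layer or the hybrid distribution is computed, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that the hash layer is a linear map followed by a sigmoid and a 0.5 threshold (a common deep-hashing design), that retrieval ranks stored codes by Hamming distance, and that visual and spatial evidence are fused by an elementwise product followed by renormalization. All function names, weights, and POI labels are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hash_layer(features, weights, bias):
    """Map a feature vector to binary code bits.

    Assumption: linear map + sigmoid, thresholded at 0.5.
    """
    bits = []
    for row, b in zip(weights, bias):
        activation = sigmoid(sum(w * f for w, f in zip(row, features)) + b)
        bits.append(1 if activation >= 0.5 else 0)
    return bits


def hamming_distance(code_a, code_b):
    """Count the positions where two equal-length codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def retrieve(query_code, database, top_k=1):
    """Return the labels of the top_k stored codes closest to the query.

    database is a list of (label, code) pairs.
    """
    ranked = sorted(database, key=lambda item: hamming_distance(query_code, item[1]))
    return [label for label, _ in ranked[:top_k]]


def hybrid_distribution(visual_probs, spatial_weights):
    """Fuse visual and spatial evidence over the same POI candidates.

    Assumption: elementwise product, then renormalization to sum to 1.
    """
    joint = [v * s for v, s in zip(visual_probs, spatial_weights)]
    total = sum(joint)
    if total == 0:
        # No spatial support for any candidate: fall back to visual evidence.
        return list(visual_probs)
    return [j / total for j in joint]


if __name__ == "__main__":
    # Hypothetical 2-D feature vector and a 3-bit hash layer.
    code = hash_layer([1.0, -2.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0])
    database = [("tower", [0, 1, 1]), ("bridge", [1, 0, 0])]
    print(code, retrieve(code, database))
    print(hybrid_distribution([0.5, 0.5], [0.9, 0.1]))
```

Comparing short binary codes by Hamming distance is what makes hash-based retrieval cheap in time and memory compared with comparing full floating-point feature vectors, which is consistent with the efficiency claim in the abstract.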

Bibliographic record

  • Source
    Multimedia Tools and Applications | 2019, Issue 17 | pp. 24347-24371 | 25 pages
  • Author affiliations

    YoutuLab Tencent Technol Shanghai Co, Shanghai 200233, Peoples R China;

    Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Key Lab Complex Syst Modeling & Simulat, Hangzhou 310018, Zhejiang, Peoples R China;

    Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Key Lab Complex Syst Modeling & Simulat, Hangzhou 310018, Zhejiang, Peoples R China;

    Zhejiang Univ, Coll Comp Sci & Technol, Database Lab, Hangzhou 310007, Zhejiang, Peoples R China;

    Zhejiang Univ, Coll Comp Sci & Technol, Database Lab, Hangzhou 310007, Zhejiang, Peoples R China;

  • Indexing information
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification
  • Keywords

    Places-of-interest; Image recognition; Image retrieval; Deep hashing;
