ForestHash: Semantic Hashing with Shallow Random Forests and Tiny Convolutional Networks


Abstract

Hash codes are efficient data representations for coping with the ever-growing amounts of data. In this paper, we introduce a random forest semantic hashing scheme that embeds tiny convolutional neural networks (CNN) into shallow random forests, with near-optimal information-theoretic code aggregation among trees. We start with a simple hashing scheme, where random trees in a forest act as hashing functions by setting '1' for the visited tree leaf, and '0' for the rest. We show that traditional random forests fail to generate hashes that preserve the underlying similarity between the trees, rendering the random forests approach to hashing challenging. To address this, we propose to first randomly group arriving classes at each tree split node into two groups, obtaining a significantly simplified two-class classification problem, which can be handled using a light-weight CNN weak learner. Such a random class grouping scheme enables code uniqueness by enforcing each class to share its code with different classes in different trees. A non-conventional low-rank loss is further adopted for the CNN weak learners to encourage code consistency, by minimizing intra-class variations and maximizing inter-class distance for the two random class groups. Finally, we introduce an information-theoretic approach for aggregating codes of individual trees into a single hash code, producing a near-optimal unique hash for each class. The proposed approach significantly outperforms state-of-the-art hashing methods for image retrieval tasks on large-scale public datasets, while performing at the level of other state-of-the-art image classification techniques and utilizing a more compact and efficient scalable representation. This work proposes a principled and robust procedure to train and deploy in parallel an ensemble of light-weight CNNs, instead of simply going deeper.
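
To make the leaf-indicator scheme concrete, here is a minimal sketch of that first step, written against scikit-learn's off-the-shelf random forest rather than the paper's CNN-split trees; the function forest_leaf_hash and all parameters are illustrative, not the authors' code. Each tree contributes a one-hot block with a '1' at the visited leaf and '0' elsewhere, and the per-tree blocks are concatenated into the raw, pre-aggregation forest code.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def forest_leaf_hash(forest, X):
        # apply() returns, per sample and per tree, the index of the
        # leaf node the sample lands in: shape (n_samples, n_trees).
        leaf_ids = forest.apply(X)
        blocks = []
        for t, est in enumerate(forest.estimators_):
            is_leaf = est.tree_.children_left == -1
            # Map raw node indices to a dense 0..n_leaves-1 numbering.
            leaf_pos = {node: i for i, node in enumerate(np.flatnonzero(is_leaf))}
            block = np.zeros((X.shape[0], int(is_leaf.sum())), dtype=np.uint8)
            for row, node in enumerate(leaf_ids[:, t]):
                block[row, leaf_pos[node]] = 1  # '1' for the visited leaf
            blocks.append(block)
        # One one-hot block per tree, concatenated into the raw code.
        return np.hstack(blocks)

    # Toy usage: shallow trees (depth 2) keep every per-tree block tiny.
    X = np.random.randn(100, 16)
    y = np.random.randint(0, 4, size=100)
    forest = RandomForestClassifier(n_estimators=8, max_depth=2,
                                    random_state=0).fit(X, y)
    codes = forest_leaf_hash(forest, X)  # exactly one 1 per tree block

Keeping the trees shallow is what keeps each block (and hence the overall code) short; the paper's contribution is then in how the trees are trained and how these raw per-tree codes are aggregated.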
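The random class grouping step can likewise be sketched in a few lines. This is an assumption-laden illustration of the idea only (the helper random_class_grouping is hypothetical): the classes arriving at a split node are randomly partitioned into two non-empty groups, so the node reduces to a two-class problem that a tiny weak learner can handle.

    import random

    def random_class_grouping(classes, seed=None):
        # Shuffle the classes arriving at this split node and cut the
        # shuffled list at a random point so both groups are non-empty.
        rng = random.Random(seed)
        shuffled = list(classes)
        rng.shuffle(shuffled)
        cut = rng.randint(1, len(shuffled) - 1)
        return set(shuffled[:cut]), set(shuffled[cut:])

    # Each tree draws an independent grouping, so a class shares its
    # binary label (and hence its code bits) with different classes in
    # different trees, which is what makes the aggregated codes unique.
    group0, group1 = random_class_grouping(range(5), seed=7)
    binary_labels = {c: (0 if c in group0 else 1) for c in range(5)}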