
Linear-Complexity Relaxed Word Mover's Distance with GPU Acceleration



Abstract

The amount of unstructured text-based data is growing every day. Querying, clustering, and classifying this big data requires similarity computations across large sets of documents. Whereas low-complexity similarity metrics are available, attention has been shifting towards more complex methods that achieve a higher accuracy. In particular, the Word Mover's Distance (WMD) method proposed by Kusner et al. is a promising new approach, but its time complexity grows cubically with the number of unique words in the documents. The Relaxed Word Mover's Distance (RWMD) method, also proposed by Kusner et al., reduces the time complexity from cubic to quadratic at the cost of a limited loss in accuracy compared with WMD. Our work contributes a low-complexity implementation of the RWMD that reduces the average time complexity to linear when operating on large sets of documents. Our linear-complexity RWMD implementation, henceforth referred to as LC-RWMD, maps well onto GPUs and can be efficiently distributed across a cluster of GPUs. Our experiments on real-life datasets demonstrate 1) a performance improvement of two orders of magnitude with respect to our GPU-based distributed implementation of the quadratic RWMD, and 2) a performance improvement of three to four orders of magnitude with respect to our distributed WMD implementation that uses GPU-based RWMD for pruning.
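
To make the quadratic cost mentioned in the abstract concrete, the following is a minimal sketch of the pairwise RWMD as it is commonly described for Kusner et al.'s method: each word in one document sends all of its weight to the nearest word in the other document, and the larger of the two one-sided costs is taken. This is not the paper's LC-RWMD or its GPU implementation. The function names (`one_sided_cost`, `rwmd`), the Euclidean distance, and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math

def one_sided_cost(weights_a, vecs_a, vecs_b):
    """Cost when every word in A moves all of its weight to its nearest word in B.

    This inner min over B for every word in A is the source of the quadratic
    cost in the number of unique words.
    """
    return sum(
        w * min(math.dist(va, vb) for vb in vecs_b)
        for w, va in zip(weights_a, vecs_a)
    )

def rwmd(weights_a, vecs_a, weights_b, vecs_b):
    """Relaxed WMD sketch: the larger of the two one-sided relaxations."""
    return max(
        one_sided_cost(weights_a, vecs_a, vecs_b),
        one_sided_cost(weights_b, vecs_b, vecs_a),
    )

# Hypothetical toy data: normalized word weights and 2-D "embeddings".
doc_a_weights = [0.5, 0.5]
doc_a_vecs = [(0.0, 0.0), (1.0, 0.0)]
doc_b_weights = [0.5, 0.5]
doc_b_vecs = [(0.0, 0.0), (1.0, 3.0)]

distance = rwmd(doc_a_weights, doc_a_vecs, doc_b_weights, doc_b_vecs)
print(distance)
```

In this toy case the A-to-B relaxation costs 0.5 and the B-to-A relaxation costs 1.5, so the sketch reports the larger value. Comparing one query against many documents this way repeats the nearest-word search for every pair, which is the per-pair cost that the abstract says LC-RWMD reduces to linear on average.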
