Conference on Empirical Methods in Natural Language Processing

Random Manhattan Integer Indexing: Incremental L_1 Normed Vector Space Construction


Abstract

Vector space models (VSMs) are mathematically well-defined frameworks that have been widely used in distributional approaches to semantics. In VSMs, high-dimensional vectors represent linguistic entities. In an application, the similarity of vectors, and thus of the entities they represent, is computed by a distance formula. The high dimensionality of the vectors, however, is a barrier to the performance of methods that employ VSMs. Consequently, a dimensionality reduction technique is employed to alleviate this problem. This paper introduces a novel technique called Random Manhattan Indexing (RMI) for the construction of ℓ_1 normed VSMs at reduced dimensionality. RMI combines the construction of a VSM and dimension reduction into an incremental, and thus scalable, two-step procedure. To attain its goal, RMI employs sparse Cauchy random projections. We further introduce Random Manhattan Integer Indexing (RMII), a computationally enhanced version of RMI. As shown in the reported experiments, RMI and RMII can be used reliably to estimate the ℓ_1 distances between vectors in a vector space of low dimensionality.
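
The general idea behind the abstract, accumulating Cauchy-distributed index vectors incrementally and then estimating ℓ_1 distances in the reduced space, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure from the paper: it uses dense rather than sparse index vectors for simplicity, it estimates distances with the sample median of absolute differences (a standard estimator for Cauchy projections, which may differ from the one the authors use), and the names `IncrementalL1Space`, `index_vector`, `observe`, `l1_estimate` and the value of `DIM` are all hypothetical.

```python
import math
import random
from collections import defaultdict
from statistics import median

DIM = 2000  # reduced dimensionality (illustrative value, not from the paper)


def index_vector(context, dim=DIM, seed=0):
    """Return a reproducible index vector of standard Cauchy variates.

    Assumption: dense vectors for simplicity; the abstract specifies
    sparse Cauchy random projections.
    """
    rng = random.Random(f"{seed}:{context}")
    # Inverse-CDF sampling: tan(pi * (U - 0.5)) is standard Cauchy.
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(dim)]


class IncrementalL1Space:
    """Hypothetical helper that builds reduced vectors one observation at a time."""

    def __init__(self, dim=DIM, seed=0):
        self.dim = dim
        self.seed = seed
        self.vectors = defaultdict(lambda: [0.0] * dim)

    def observe(self, entity, context, weight=1.0):
        # Add the context's index vector, scaled by the weight,
        # to the running vector of the entity.
        r = index_vector(context, self.dim, self.seed)
        v = self.vectors[entity]
        for j in range(self.dim):
            v[j] += weight * r[j]

    def l1_estimate(self, a, b):
        # Each coordinate difference is Cauchy with scale equal to the
        # l1 distance of the original vectors, and the median of the
        # absolute value of such a variate equals that scale.
        va, vb = self.vectors[a], self.vectors[b]
        return median(abs(x - y) for x, y in zip(va, vb))


space = IncrementalL1Space()
observations = [("cat", "purr", 3), ("cat", "pet", 2),
                ("dog", "pet", 4), ("dog", "bark", 5)]
for entity, context, count in observations:
    space.observe(entity, context, count)

# Original vectors over (purr, pet, bark): cat = (3, 2, 0), dog = (0, 4, 5),
# so the true l1 distance is 3 + 2 + 5 = 10.
estimate = space.l1_estimate("cat", "dog")
```

Because every index vector is derived from the context name and a seed, new observations can be added at any time without storing or recomputing a full projection matrix. The estimate approaches the true distance as `DIM` grows.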
