Conference on Empirical Methods in Natural Language Processing

Random Manhattan Integer Indexing: Incremental L_1 Normed Vector Space Construction


Abstract

Vector space models (VSMs) are mathematically well-defined frameworks that have been widely used in distributional approaches to semantics. In VSMs, high-dimensional vectors represent linguistic entities. In an application, the similarity of vectors, and thus of the entities that they represent, is computed by a distance formula. The high dimensionality of vectors, however, is a barrier to the performance of methods that employ VSMs. Consequently, a dimensionality reduction technique is employed to alleviate this problem. This paper introduces a novel technique called Random Manhattan Indexing (RMI) for the construction of L_1 normed VSMs at reduced dimensionality. RMI combines the construction of a VSM and dimension reduction into an incremental and thus scalable two-step procedure. In order to attain its goal, RMI employs sparse Cauchy random projections. We further introduce Random Manhattan Integer Indexing (RMII): a computationally enhanced version of RMI. As shown in the reported experiments, RMI and RMII can be used reliably to estimate the L_1 distances between vectors in a vector space of low dimensionality.
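The incremental idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes dense standard Cauchy index vectors rather than the sparse projections the paper employs, it does not cover the integer variant (RMII), and it uses the sample median of absolute differences as the distance estimator. The dimensionality `m`, the function names, and the toy co-occurrence counts are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 2000  # reduced dimensionality (hypothetical choice)

index_vectors = {}  # context element -> its fixed random index vector


def index_vector(context):
    # Step 1: lazily assign each context element an m-dimensional vector
    # of i.i.d. standard Cauchy entries (dense here; an assumption).
    if context not in index_vectors:
        index_vectors[context] = rng.standard_cauchy(m)
    return index_vectors[context]


def add_cooccurrence(entity_vec, context, weight=1.0):
    # Step 2: incremental update. Each observed co-occurrence adds the
    # context's index vector to the entity's low-dimensional vector.
    entity_vec += weight * index_vector(context)


def estimate_l1(u, v):
    # Each coordinate of u - v is Cauchy distributed with scale equal to
    # the L_1 distance in the original space, so the median of the
    # absolute values estimates that distance.
    return float(np.median(np.abs(u - v)))


# Toy co-occurrence counts for two entities (made-up data).
counts_a = {"drink": 3, "cup": 2, "hot": 1}
counts_b = {"drink": 1, "cup": 2, "cold": 4}

vec_a = np.zeros(m)
vec_b = np.zeros(m)
for context, n in counts_a.items():
    add_cooccurrence(vec_a, context, n)
for context, n in counts_b.items():
    add_cooccurrence(vec_b, context, n)

# Exact L_1 distance in the original (unreduced) space, for comparison.
contexts = set(counts_a) | set(counts_b)
true_l1 = sum(abs(counts_a.get(c, 0) - counts_b.get(c, 0)) for c in contexts)
est = estimate_l1(vec_a, vec_b)
print(true_l1, est)
```

Because the entity vectors are only ever updated by addition, new text can be processed without rebuilding the full high-dimensional matrix, which is the property the abstract describes as incremental and scalable.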
