ACM Transactions on Internet Technology
Progressive Random Indexing: Dimensionality Reduction Preserving Local Network Dependencies


Abstract

The vector space model is undoubtedly among the most popular data representation models used in the processing of large networks. Unfortunately, the vector space model suffers from the so-called curse of dimensionality, a phenomenon where data become extremely sparse due to an exponential growth of the data space volume caused by a large number of dimensions. Thus, dimensionality reduction techniques are necessary to make large networks represented in the vector space model available for analysis and processing. Most dimensionality reduction techniques tend to focus on principal components present in the data, effectively disregarding local relationships that may exist between objects. This behavior is a significant drawback of current dimensionality reduction techniques, because these local relationships are crucial for maintaining high accuracy in many network analysis tasks, such as link prediction or community detection. To rectify the aforementioned drawback, we propose Progressive Random Indexing, a new dimensionality reduction technique. Built upon Reflective Random Indexing, our method significantly reduces the dimensionality of the vector space model while retaining all important local relationships between objects. The key element of the Progressive Random Indexing technique is the use of the gain value at each reflection step, which determines how much information about local relationships should be included in the space of reduced dimensionality. Our experiments indicate that when applied to large real-world networks (Facebook social network, MovieLens movie recommendations), Progressive Random Indexing outperforms state-of-the-art methods in link prediction tasks.
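The abstract describes the method only at a high level, so the following is a minimal, hypothetical Python sketch of reflective random indexing with a per-step gain weighting, in the spirit of the description above. The function names, the gain schedule, and the normalization choice are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def random_index_vectors(n_nodes, dim, nnz=4, seed=0):
    """Sparse ternary random index vectors (a few +1/-1 entries per node)."""
    rng = np.random.default_rng(seed)
    vectors = np.zeros((n_nodes, dim))
    for i in range(n_nodes):
        positions = rng.choice(dim, size=nnz, replace=False)
        vectors[i, positions] = rng.choice([-1.0, 1.0], size=nnz)
    return vectors

def progressive_random_indexing(adjacency, dim=128, reflections=2, gains=(1.0, 0.5)):
    """Sketch of gain-weighted reflective random indexing.

    adjacency:   (n, n) binary adjacency matrix of the network.
    reflections: number of reflection steps to perform.
    gains:       hypothetical per-step weights controlling how much of each
                 reflection's local-relationship information enters the
                 final low-dimensional embedding.
    """
    n = adjacency.shape[0]
    current = random_index_vectors(n, dim)
    embedding = np.zeros((n, dim))
    for step in range(reflections):
        # One reflection: each node accumulates the vectors of its neighbours.
        context = adjacency @ current
        # Row-normalize so successive reflections stay on a comparable scale.
        norms = np.linalg.norm(context, axis=1, keepdims=True)
        norms[norms == 0] = 1.0
        context = context / norms
        # Gain-weighted accumulation of this reflection's contribution.
        embedding += gains[step] * context
        current = context
    return embedding
```

For a link prediction task such as the one evaluated in the paper, the resulting rows could be compared with cosine similarity, e.g. `emb[u] @ emb[v]` after normalizing each row, with higher scores indicating more likely links; this scoring step is likewise an assumption, not taken from the abstract.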
