IEEE International Conference on Big Data

ImVerde: Vertex-Diminished Random Walk for Learning Imbalanced Network Representation

Abstract

Imbalanced data are common in many high-impact applications. In air traffic control, for example, accident causes fall into three types, and historical accident reports citing 'personnel issues' outnumber those citing the other two types ('aircraft issues' and 'environmental issues') combined, so the resulting data set of accident reports is highly imbalanced. At the same time, this data set can be naturally modeled as a network, with each node representing an accident report and each edge indicating the similarity of a pair of reports. Most existing work on imbalanced data analysis has focused on the classification setting, and very little is devoted to learning node representations for imbalanced networks. To bridge this gap, we first propose Vertex-Diminished Random Walk (VDRW) for imbalanced network analysis. It differs significantly from the existing Vertex Reinforced Random Walk by discouraging the random particle from returning to nodes that have already been visited. This design is particularly suitable for imbalanced networks because the random particle is more likely to visit nodes from the same class, which is a desired property for learning node representations. Based on VDRW, we then propose a semi-supervised network representation learning framework for imbalanced networks, named ImVerde, in which context sampling uses VDRW and the limited label information to create node-context pairs, and balanced-batch sampling adopts a simple under-sampling method to balance these pairs across classes. Experimental results demonstrate that ImVerde based on VDRW outperforms state-of-the-art algorithms for learning network representations from imbalanced data.
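
The idea of discouraging returns to visited nodes can be illustrated with a minimal sketch. The abstract does not give the transition rule, so the code below assumes one plausible form: each prior visit to a node multiplies its selection weight by a decay factor `alpha`. The function name `vertex_diminished_walk`, the parameter `alpha`, and the adjacency-dict input are all hypothetical choices made for illustration and may differ from the authors' formulation.

```python
import random
from collections import defaultdict


def vertex_diminished_walk(adj, start, length, alpha=0.5, rng=None):
    """Sample a walk in which already-visited nodes become less likely.

    adj    : dict mapping each node to a list of its neighbours
    start  : node at which the walk begins
    length : maximum number of nodes in the returned walk
    alpha  : assumed decay factor in (0, 1]; alpha = 1 gives a plain
             uniform random walk, smaller values penalise revisits more
    rng    : optional random.Random instance for reproducibility
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    rng = rng or random.Random()

    visits = defaultdict(int)  # how many times each node has been visited
    visits[start] = 1
    walk = [start]
    current = start

    for _ in range(length - 1):
        neighbours = adj[current]
        if not neighbours:
            break  # dead end: stop the walk early
        # Assumption: weight shrinks geometrically with the visit count.
        weights = [alpha ** visits[v] for v in neighbours]
        current = rng.choices(neighbours, weights=weights, k=1)[0]
        visits[current] += 1
        walk.append(current)

    return walk


if __name__ == "__main__":
    # Toy triangle graph; a small alpha pushes the walk toward unvisited nodes.
    toy_graph = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    print(vertex_diminished_walk(toy_graph, 0, 6, alpha=0.2, rng=random.Random(7)))
```

Under this assumed rule, the opposite behaviour (weights that grow with the visit count) would correspond to a reinforced walk, which is the contrast the abstract draws.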
