
On Sampling Type Distribution from Heterogeneous Social Networks



Abstract

Social network analysis has recently drawn the attention of many researchers. With the advance of communication technologies, the scale of social networks is growing rapidly. To capture the characteristics of very large social networks, graph sampling is an important approach because it does not require visiting the entire network. Prior studies on graph sampling focused on preserving properties such as the degree distribution and clustering coefficient of a homogeneous graph, where all nodes and edges are treated equally. However, a node in a social network usually has its own attribute indicating a specific group membership or type. For example, people are of different races or nationalities. A link between individuals can thus be classified as an intra-connection (both endpoints of the same type) or an inter-connection (endpoints of different types). Therefore, it is important to ask whether a sampling method can preserve the node and link type distributions of a heterogeneous social network. In this paper, we formally address this issue. Moreover, we apply five algorithms to real Twitter data sets to evaluate their performance. The results show that respondent-driven sampling works well even when the sample size is small, while random node sampling works best only with large sample sizes.
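The quantities the abstract refers to can be illustrated with a minimal sketch. It assumes a toy graph, made-up type labels "A" and "B", and total variation distance as the error measure. None of these come from the paper, and the function names are hypothetical. Only random node sampling is shown; respondent-driven sampling and the other algorithms evaluated in the paper are not reproduced here.

```python
import random
from collections import Counter

def node_type_distribution(node_types, nodes):
    """Fraction of the given nodes that belong to each type."""
    counts = Counter(node_types[n] for n in nodes)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def link_type_distribution(node_types, edges):
    """Fraction of edges that are intra-type vs. inter-type."""
    counts = Counter(
        "intra" if node_types[u] == node_types[v] else "inter"
        for u, v in edges
    )
    total = sum(counts.values())
    if total == 0:
        return {}  # the sample contains no edges
    return {k: counts[k] / total for k in ("intra", "inter")}

def random_node_sample(nodes, edges, k, rng):
    """Pick k nodes uniformly at random; keep edges with both ends sampled."""
    chosen = set(rng.sample(sorted(nodes), k))
    kept = [(u, v) for u, v in edges if u in chosen and v in chosen]
    return chosen, kept

def total_variation(p, q):
    """Total variation distance between two discrete distributions
    (an assumed error measure, not necessarily the paper's)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

# Toy heterogeneous network: hypothetical data for illustration only.
node_types = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (2, 3)]

full_nodes = node_type_distribution(node_types, node_types)
full_links = link_type_distribution(node_types, edges)

rng = random.Random(0)  # fixed seed so the run is repeatable
sampled_nodes, sampled_edges = random_node_sample(node_types, edges, 4, rng)
sample_nodes_dist = node_type_distribution(node_types, sampled_nodes)
sample_links_dist = link_type_distribution(node_types, sampled_edges)

print("node-type error:", total_variation(full_nodes, sample_nodes_dist))
print("link-type error:", total_variation(full_links, sample_links_dist))
```

A smaller error means the sample preserves the corresponding type distribution more faithfully. Repeating the comparison over many samples and sample sizes would give the kind of evaluation the abstract describes, under whatever error measure the paper actually uses.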
