International Conference on Advances in Social Networks Analysis and Mining

SNAP: Towards a Validation of the Social Network Assembly Pipeline

Abstract

A key problem for social network analysis is the lack of ground-truth data against which to validate an analysis. Consider, for example, community-finding algorithms. The "communities" identified by such algorithms are typically justified on the basis of their structural properties rather than their ability to recover communities that can be independently verified. A ground truth of actual community data is not always available; at best, only partial ground-truth community information exists. This problem is not unique to community-finding algorithms, however. In previous publications, we introduced an automated Social Network Assembly Pipeline, which we refer to as SNAP. It is intended for large-scale actor identification, tie inference, and strength measurement of social networks from non-relational data sets. In this paper we describe a validation study of SNAP through an intensive user study of a portion of the individuals in the network. Individuals are asked to validate the network relationships uncovered by SNAP, and where misclassified relationships are found, the individuals are interviewed to determine the underlying cause of the misclassification. The findings provide feedback on the rules through which relationships are inferred. For instance, it becomes clear that an error in actor identification can propagate through the network relations, leading to follow-on relationship misclassifications. We also observe how outliers lead to a propagation of error in the inferred network. The results help us validate or invalidate different hypotheses we have about SNAP and suggest domain-specific rule sets for SNAP.
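
The abstract does not spell out SNAP's rules, so the following is only a toy sketch of how a single actor-identification error can carry over into misclassified relationships. Everything in it is an assumption made for illustration: the records, the names, the co-occurrence tie rule, and the helper functions `infer_ties` and `ego` are hypothetical and are not taken from SNAP.

```python
from itertools import combinations

# Hypothetical non-relational records: (raw name string, event id).
records = [
    ("J. Smith", "e1"), ("A. Jones", "e1"),
    ("J. Smith", "e2"), ("B. Lee", "e2"),
]

# Naive actor identification (assumption): one actor per distinct name string.
naive_ids = [name for name, _ in records]

# Toy ground truth: the two "J. Smith" rows are actually different people.
true_ids = ["J. Smith#1", "A. Jones", "J. Smith#2", "B. Lee"]


def infer_ties(actor_ids, records):
    """Toy tie rule: link two actors whenever they share an event."""
    by_event = {}
    for actor, (_, event) in zip(actor_ids, records):
        by_event.setdefault(event, set()).add(actor)
    ties = set()
    for members in by_event.values():
        for a, b in combinations(sorted(members), 2):
            ties.add((a, b))
    return ties


def ego(ties, actor):
    """Return the set of actors tied to `actor`."""
    return {b if a == actor else a for a, b in ties if actor in (a, b)}


naive_ties = infer_ties(naive_ids, records)
true_ties = infer_ties(true_ids, records)

# The merged "J. Smith" node inherits ties from both real people, so the
# first J. Smith would be shown a relationship with B. Lee that is not theirs.
print(ego(naive_ties, "J. Smith"))
print(ego(true_ties, "J. Smith#1"))
```

In this toy setting the tie rule itself is applied correctly; the false relationship arises purely from the upstream identification error, which is the kind of follow-on misclassification the abstract describes.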
