U.S. National Institutes of Health literature > PLoS Clinical Trials

Estimation of Global Network Statistics from Incomplete Data

Abstract

Complex networks underlie an enormous variety of social, biological, physical, and virtual systems. A profound complication for the science of complex networks is that in most cases, observing all nodes and all network interactions is impossible. Previous work addressing the impacts of partial network data is surprisingly limited, focuses primarily on missing nodes, and suggests that network statistics derived from subsampled data are not suitable estimators for the same network statistics describing the overall network topology. We generate scaling methods to predict true network statistics, including the degree distribution, from only partial knowledge of nodes, links, or weights. Our methods are transparent and do not assume a known generating process for the network, thus enabling prediction of network statistics for a wide variety of applications. We validate analytical results on four simulated network classes and empirical data sets of various sizes. We perform subsampling experiments by varying proportions of sampled data and demonstrate that our scaling methods can provide very good estimates of true network statistics while acknowledging limits. Lastly, we apply our techniques to a set of rich and evolving large-scale social networks, Twitter reply networks. Based on 100 million tweets, we use our scaling techniques to propose a statistical characterization of the Twitter Interactome from September 2008 to November 2008. Our treatment allows us to find support for Dunbar's hypothesis in detecting an upper threshold for the number of active social contacts that individuals maintain over the course of one week.
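To make the idea of a scaling method concrete, the following is a minimal sketch and not necessarily the authors' exact estimators. It assumes that a known fraction `q` of nodes is sampled uniformly at random and that only links between two sampled nodes are observed. Under that assumption a link survives with probability of roughly `q**2`, so the observed link count can be rescaled by `1 / q**2`. The function names and the Erdős–Rényi test graph are illustrative choices, not taken from the paper.

```python
import random


def random_graph(n, p, rng):
    """Erdős–Rényi G(n, p) as a set of undirected edges (i, j) with i < j."""
    return {(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p}


def sample_induced(edges, n, q, rng):
    """Keep a fraction q of the n nodes uniformly at random,
    together with the edges whose endpoints were both kept."""
    kept = set(rng.sample(range(n), round(q * n)))
    sub = {(i, j) for (i, j) in edges if i in kept and j in kept}
    return kept, sub


def scale_up(n_obs, m_obs, q):
    """Rescale observed counts to estimates for the full network.

    Assumption: uniform node sampling with known fraction q, so
    nodes scale by 1/q and induced edges by roughly 1/q**2.
    """
    n_hat = n_obs / q              # estimated number of nodes
    m_hat = m_obs / q ** 2         # estimated number of edges
    k_hat = 2 * m_hat / n_hat      # estimated average degree
    return n_hat, m_hat, k_hat


rng = random.Random(42)
true_edges = random_graph(500, 0.04, rng)
kept_nodes, seen_edges = sample_induced(true_edges, 500, 0.5, rng)
n_hat, m_hat, k_hat = scale_up(len(kept_nodes), len(seen_edges), 0.5)
print("true edges:", len(true_edges), "estimated edges:", round(m_hat))
print("true mean degree:", 2 * len(true_edges) / 500, "estimated:", round(k_hat, 2))
```

The raw average degree of the sampled subgraph is about `q` times the true value, which illustrates why unscaled subsample statistics are poor estimators of the full-network statistics. The sketch covers only missing nodes; missing links or weights would need different scaling factors.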