Venue: International Conference on Computational Aspects of Social Networks

Implementing quasi-parallel breadth-first search in MapReduce for large-scale social network mining

Abstract

Online social networks such as Weibo and Twitter consist of billions of users and connections, and traditional approaches that rely on serial algorithms running on a single node, or even a single core, can no longer cope with data at that scale. We propose a new distributed quasi-parallel scheme for breadth-first search, a common graph traversal algorithm, based on the MapReduce framework. In terms of computational complexity and I/O load, it outperforms Pegasus, the state-of-the-art graph mining library: time complexity is up to one order of magnitude lower in single-source cases, and the gain is even larger in multiple-source cases. We apply our algorithms to the Weibo dataset, crawled from its website, which contains 135 million users and 10.2 billion directed connections among them and occupies up to 400 gigabytes. The dataset is by far the largest online social network dataset used in research. Based on the Weibo dataset, whose degree distribution is extremely skewed, we give an empirical analysis of the time complexity and I/O load in each iteration of our proposed methods. We also ran experiments on a 20-node Hadoop cluster to validate our analysis, and the results conform to our empirical predictions.
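For readers unfamiliar with how breadth-first search maps onto MapReduce, the following is a minimal sketch of the textbook level-synchronous formulation, in which each iteration is one map/shuffle/reduce job that extends the frontier by one hop. It is not the quasi-parallel scheme proposed in the paper, whose details the abstract does not give, and it simulates the jobs in memory rather than on Hadoop. All function names (`bfs_map`, `bfs_reduce`, `run_iteration`, `mapreduce_bfs`) and the record layout are assumptions made for illustration.

```python
from collections import defaultdict

INF = float("inf")  # marks nodes not yet reached

def bfs_map(node, record):
    """Map step: re-emit the node's own data and offer dist + 1 to each neighbor."""
    dist, neighbors = record
    yield node, ("graph", neighbors)   # carry the adjacency list to the next job
    yield node, ("dist", dist)         # keep the distance already known
    if dist != INF:
        for nbr in neighbors:
            yield nbr, ("dist", dist + 1)

def bfs_reduce(node, values):
    """Reduce step: restore the adjacency list and keep the smallest distance."""
    neighbors, best = [], INF
    for kind, payload in values:
        if kind == "graph":
            neighbors = payload
        else:
            best = min(best, payload)
    return node, (best, neighbors)

def run_iteration(state):
    """Simulate one MapReduce job in memory: map, shuffle by key, reduce."""
    shuffled = defaultdict(list)
    for node, record in state.items():
        for key, value in bfs_map(node, record):
            shuffled[key].append(value)
    return dict(bfs_reduce(key, values) for key, values in shuffled.items())

def mapreduce_bfs(adjacency, source):
    """Run jobs until no distance changes; return {node: hop count from source}."""
    state = {
        node: (0 if node == source else INF, list(nbrs))
        for node, nbrs in adjacency.items()
    }
    while True:
        new_state = run_iteration(state)
        if new_state == state:
            break
        state = new_state
    return {node: dist for node, (dist, _) in state.items()}

if __name__ == "__main__":
    # Hypothetical toy graph of directed "follows" edges.
    graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["a"]}
    print(mapreduce_bfs(graph, "a"))
```

In this simple formulation every job rereads and rewrites the whole graph, so the number of jobs grows with the depth of the search and the I/O per job does not shrink as the frontier does. This is the kind of per-iteration cost that the abstract's complexity and I/O load analysis addresses.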
