首页> 外文期刊>Foundations and trends in theoretical computer science >Scalable Algorithms for Data and Network Analysis
【24h】

Scalable Algorithms for Data and Network Analysis

机译:数据和网络分析的可扩展算法

获取原文
获取原文并翻译 | 示例
       

摘要

In the age of Big Data, efficient algorithms are now in higher demand more than ever before. While Big Data takes us into the asymptotic world envisioned by our pioneers, it also challenges the classical notion of efficient algorithms: Algorithms that used to be considered efficient, according to polynomial-time characterization, may no longer be adequate for solving today's problems. It is not just desirable, but essential, that efficient algorithms should be scalable. In other words, their complexity should be nearly linear or sub-linear with respect to the problem size. Thus, scalability, not just polynomial-time computability, should be elevated as the central complexity notion for characterizing efficient computation. In this tutorial, I will survey a family of algorithmic techniques for the design of provably-good scalable algorithms. These techniques include local network exploration, advanced sampling, sparsification, and geometric partitioning. They also include spectral graph-theoretical methods, such as those used for computing electrical flows and sampling from Gaussian Markov random fields. These methods exemplify the fusion of combinatorial, numerical, and statistical thinking in network analysis. I will illustrate the use of these techniques by a few basic problems that are fundamental in network analysis, particularly for the identification of significant nodes and coherent clusters/communities in social and information networks. I also take this opportunity to discuss some frameworks beyond graph-theoretical models for studying conceptual questions to understand multifaceted network data that arise in social influence, network dynamics, and Internet economics.
机译:在大数据时代,高效算法的需求比以往任何时候都更高。尽管大数据将我们带入了我们的先驱者所设想的渐近世界,但它也挑战了高效算法的经典概念:根据多项式时间特征,过去被视为高效的算法可能不再足以解决当今的问题。有效的算法应该是可扩展的,这不仅是理想的,而且是必不可少的。换句话说,就问题大小而言,它们的复杂度应接近线性或亚线性。因此,应提高可伸缩性,而不仅仅是多项式时间可计算性,作为表征高效计算的核心复杂性概念。在本教程中,我将调查一系列算法技术,以设计可证明良好的可扩展算法。这些技术包括本地网络探索,高级采样,稀疏化和几何分区。它们还包括频谱图理论方法,例如用于计算电流和从高斯马尔可夫随机场中采样的方法。这些方法体现了网络分析中组合思维,数字思维和统计思维的融合。我将通过一些基本的问题来说明这些技术的使用,这些基本问题是网络分析的基础,特别是在识别社交网络和信息网络中的重要节点和相干集群/社区时。我还借此机会讨论了图论模型之外的一些框架,这些框架用于研究概念性问题,以理解社会影响力,网络动力学和互联网经济学中产生的多方面网络数据。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号