首页> 外文学位 >Graph and geometric algorithms on distributed networks and databases.
【24h】

Graph and geometric algorithms on distributed networks and databases.

机译:分布式网络和数据库上的图形和几何算法。

获取原文
获取原文并翻译 | 示例

摘要

This thesis studies the powers and limits of graph and geometric algorithms when we face with massive data, sometimes distributed on several machines. In this case, efficient algorithms are often required to use sublinear resources, such as time, memory, and communication. Three specific models of our interest are distributed networks, data streams, and communication complexity. In these models, we study lower and upper bounds of many problems motivated by applications in networking and database areas, as follows.;How fast are approximation algorithms on distributed networks? We show lower bounds of many approximation graph algorithms by studying the verification problem, stated as follows. Let H be a subgraph of a network G where each vertex of G knows which edges incident on it are in H. We would like to verify whether H has some properties, e.g., if it is a tree or if it is connected (every node knows in the end of the process whether H has the specified property or not). We would like to perform this verification in a decentralized fashion via a distributed algorithm. The time complexity of verification is measured as the number of rounds of distributed communication.;In this thesis we initiate a systematic study of distributed verification, and give almost tight lower bounds on the running time of distributed verification algorithms for many fundamental problems such as connectivity, spanning connected subgraph, and s -- t cut verification.;We then use this these results to derive strong unconditional time lower bounds on the hardness of distributed approximation for many classical optimization problems including minimum spanning tree, shortest paths, and minimum cut. Many of these results are the first non-trivial lower bounds for both exact and approximate distributed computation and they resolve previous open questions. Moreover, our unconditional lower bound of approximating minimum spanning tree (MST) subsumes and improves upon the previous hardness of approximation bound of Elkin [STOC 2004] as well as the lower bound for (exact) MST computation of Peleg and Rubinovich [FOCS 1999]. Our result implies that there can be no distributed approximation algorithm for MST that is significantly faster than the current exact algorithm, for any approximation factor.;In this thesis, we study the skyline problem on the multi-pass streaming model. This model is the most restricted (least powerful) among many models that differentiates between sequential and random accesses as algorithms on this model are allowed to do a random access only once after each round of sequential accesses to the whole input.;We show that, even in this very restricted model, one can still get an efficient algorithm. Our algorithm uses space only enough to store the skyline and uses linear sequential and logarithmic random accesses (or passes in the terminology of data streams). To the best of our knowledge, it is the first randomized skyline algorithm in the literature. We also prove that this our algorithm is near-optimal.;Additionally, we show that the same idea can be used to develop a distributed algorithm that is near optimal. We also show that the algorithm can handle partially ordered domains on each attribute. Finally, we demonstrate the robustness of this algorithm via extensive experiments on both real and synthetic datasets. Our algorithm is comparable to the existing algorithms in average case and additionally tolerant to simple modifications of the data, while other algorithms degrade considerably with such variations. (Abstract shortened by UMI.)
机译:当我们面对海量数据(有时分布在多台机器上)时,本论文研究了图形和几何算法的能力和局限性。在这种情况下,通常需要高效的算法来使用次线性资源,例如时间,内存和通信。我们感兴趣的三个特定模型是分布式网络,数据流和通信复杂性。在这些模型中,我们研究了网络和数据库领域中应用程序引发的许多问题的上下限,如下所示:分布式网络上的近似算法有多快?通过研究验证问题,我们展示了许多近似图算法的下界,如下所述。令H为网络G的子图,其中G的每个顶点都知道入射在H上的哪些边。我们想验证H是否具有某些属性,例如,它是否是树或是否已连接(每个节点)在过程结束时知道H是否具有指定的属性)。我们希望通过分布式算法以分散的方式执行此验证。验证的时间复杂度用分布式通信的回合数来衡量。;本文中,我们对分布式验证进行了系统的研究,并针对诸如连接性等许多基本问题,对分布式验证算法的运行时间给出了几乎严格的下限。 ,跨连接子图和s-t割验证;然后,我们使用这些结果得出许多经典优化问题(包括最小生成树,最短路径和最小割)的强大的无条件时间下界。这些结果中的许多是精确和近似分布式计算的第一个非平凡的下界,它们解决了以前的未解决的问题。而且,我们的无条件近似最小生成树(MST)的下界包含并改善了Elkin的近似界以前的硬度[STOC 2004]以及Peleg和Rubinovich [精确的] MST计算的下界[FOCS 1999] 。我们的结果表明,对于任何近似因子,MST都不存在比当前精确算法快得多的分布式近似算法。在本论文中,我们研究了多通道流模型上的天际线问题。该模型是许多模型中限制最大(最不强大)的模型,它区分顺序访问和随机访问,因为对该模型的算法在每次对整个输入进行顺序访问之后,仅允许对其进行一次随机访问。即使在这种非常受限的模型中,仍然可以得到一种有效的算法。我们的算法仅使用足够的空间来存储天际线,并使用线性顺序和对数随机访问(或以数据流的术语传递)。据我们所知,它是文献中第一个随机的天际线算法。我们还证明了我们的算法是接近最优的。此外,我们证明了相同的思想可以用于开发接近最优的分布式算法。我们还展示了该算法可以处理每个属性上的部分有序域。最后,我们通过在真实和合成数据集上的大量实验证明了该算法的鲁棒性。我们的算法在一般情况下可以与现有算法相提并论,此外还可以容忍对数据的简单修改,而其他算法会因此类变化而严重退化。 (摘要由UMI缩短。)

著录项

  • 作者

    Nanongkai, Danupon.;

  • 作者单位

    Georgia Institute of Technology.;

  • 授予单位 Georgia Institute of Technology.;
  • 学科 Applied Mathematics.;Computer Science.
  • 学位 Ph.D.
  • 年度 2011
  • 页码 214 p.
  • 总页数 214
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类
  • 关键词

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号