We study the communication complexity of evaluating functionswhen the input data is randomly allocated (according to some knowndistribution) amongst two or more players, possibly withinformation overlap. This naturally extends previously studiedvariable partition models such as the best-case and worst-casepartition models. We aim to understand whether the hardness of acommunication problem holds for almost every allocation of theinput, as opposed to holding for perhaps just a few atypicalpartitions. A key application is to the heavily studied data streammodel. There is a strong connection between our communication lowerbounds and lower bounds in the data stream model that are“robust” to the ordering of the data. That is, weprove lower bounds for when the order of the items in the stream ischosen not adversarially but rather uniformly (or near-uniformly)from the set of all permutations. This random-order data streammodel has attracted recent interest, since lower bounds here givestronger evidence for the inherent hardness of streaming problems. Our results include the first random-partition communicationlower bounds for problems including multi-party set disjointnessand gap-Hamming-distance. Both are tight. We also extend andimprove previous results for a form of pointer jumping that isrelevant to the problem of selection (in particular, medianfinding). Collectively, these results yield lower bounds for avariety of problems in the random-order data stream model,including estimating the number of distinct elements, approximatingfrequency moments, and quantile estimation. A short version of this article is available in the Proceedings of the 40th Annual ACM Symposium on Theory of Computing(STOC'08), ACM, pp. 641-650. Compared to the conference presentation,this version considerably expands the detail of the discussionand in the proofs, and substantially changes some of theproof techniques.
展开▼