首页> 外文会议>Distributed computing >Optimal Random Sampling from Distributed Streams Revisited
【24h】

Optimal Random Sampling from Distributed Streams Revisited

机译:再论分布式流的最优随机抽样

获取原文
获取原文并翻译 | 示例

摘要

We give an improved algorithm for drawing a random sample from a large data stream when the input elements are distributed across multiple sites which communicate via a central coordinator. At any point in time the set of elements held by the coordinator represent a uniform random sample from the set of all the elements observed so far. When compared with prior work, our algorithms asymptotically improve the total number of messages sent in the system as well as the computation required of the coordinator. We also present a matching lower bound, showing that our protocol sends the optimal number of messages up to a constant factor with large probability. As a byproduct, we obtain an improved algorithm for finding the heavy hitters across multiple distributed sites.
机译:当输入元素分布在通过中央协调器进行通信的多个站点上时,我们提供了一种改进的算法,可从大型数据流中抽取随机样本。在任何时间点,协调器保存的元素集代表到目前为止观察到的所有元素集中的统一随机样本。与以前的工作相比,我们的算法渐近地改善了系统中发送的消息总数以及协调器所需的计算量。我们还提出了一个匹配的下限,表明我们的协议以最大的概率将最佳数量的消息发送到一个恒定因子。作为副产品,我们获得了一种改进的算法,可用于在多个分布式站点中查找重击手。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号