Journal: IEEE Transactions on Parallel and Distributed Systems

Distributed Garbage Collection Algorithms for Timestamped Data

Abstract

An important class of interactive multimedia applications deals with stream data from distributed sources. Indexing the data temporally makes it easier to order individual streams and to correlate items from different streams. The Stampede programming system organizes stream data into channels, which are distributed, synchronized data structures containing timestamped items. A Stampede program is a data flow graph of threads and channels. Stampede channel semantics allow multiple threads to access a channel concurrently for both input and output. Although a channel holds timestamped items, the semantics place no restriction on the order in which these items are produced or consumed. Furthermore, the timestamps of items in a channel need not be contiguous. This flexibility is required by the dynamic and parallel structure of the stream-oriented applications that Stampede targets. Under such circumstances, a key issue is the "garbage collection" (GC) of channel items. In this paper, we present and compare three GC algorithms: 1) REF is a simple algorithm that keeps a reference count on individual items; 2) TGC is a distributed algorithm that computes a global low watermark for the timestamp values of interest across the entire application; 3) DGC is another distributed algorithm that uses information about the dependencies between the producers and consumers of data streams to compute a low watermark local to each node of the data flow graph. DGC can simultaneously eliminate garbage from channels and unneeded computations from threads. In tests performed using an interactive application, DGC reduces the application memory footprint by nearly 30 percent compared to TGC and REF. DGC and REF are also shown to be more scalable than TGC.
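
To make the difference between per-item reference counting and watermark-based collection concrete, here is a minimal single-process sketch in Python. It is not the Stampede API or the paper's implementation: the `Channel` class, its `consumers` parameter, and the `global_low_watermark` helper are hypothetical names introduced only for illustration. The sketch also assumes every item is read by a fixed number of consumers, and that each thread can report the smallest timestamp it may still request. The distributed protocols that TGC and DGC use to compute their watermarks are not modeled here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable


@dataclass
class Channel:
    """Toy channel mapping timestamp -> [item, remaining consumer count].

    Hypothetical illustration only; not the Stampede channel API.
    """
    consumers: int  # assumed: number of consumers expected per item
    items: Dict[int, list] = field(default_factory=dict)

    def put(self, ts: int, item: Any) -> None:
        # Timestamps may arrive in any order and need not be contiguous.
        self.items[ts] = [item, self.consumers]

    def get(self, ts: int) -> Any:
        return self.items[ts][0]

    def consume(self, ts: int) -> None:
        """REF-like: free the item once every consumer has released it."""
        entry = self.items[ts]
        entry[1] -= 1
        if entry[1] == 0:
            del self.items[ts]

    def collect_below(self, watermark: int) -> int:
        """Watermark-like: free every item whose timestamp < watermark."""
        dead = [ts for ts in self.items if ts < watermark]
        for ts in dead:
            del self.items[ts]
        return len(dead)


def global_low_watermark(thread_interest: Iterable[int]) -> int:
    """Smallest timestamp any thread may still request (TGC-like idea).

    Assumes the per-thread values have already been gathered; the real
    algorithm computes this value in a distributed fashion.
    """
    return min(thread_interest)
```

For example, with `ch = Channel(consumers=2)` holding items at timestamps 7, 3, and 10, calling `ch.consume(3)` twice frees item 3 under the reference-counting rule. If the threads then report interest values 10 and 12, `ch.collect_below(global_low_watermark([10, 12]))` frees item 7 even though no consumer ever touched it. Skipped items like this are the case that reference counting alone cannot reclaim.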
