
DIsCO: Dynamic Data Compression in Distributed Stream Processing Systems


Abstract

Supporting high throughput in Distributed Stream Processing Systems (DSPSs) has been an important goal in recent years. Existing work either automatically increases the system resources whenever the current setup is inadequate, or applies load shedding techniques that discard some of the incoming data. However, both approaches have significant shortcomings: the former requires on-the-fly application reconfiguration, in which the application must be stopped and re-uploaded to the cluster with the new configuration, and the latter can lead to significant information loss. One approach that has not yet been considered for improving the throughput of DSPSs is to exploit compression algorithms to minimize the communication overhead between components, especially when the data is large, as with live CCTV camera reports. This work is the first to provide a framework, built on top of Apache Storm, that enables dynamic compression of incoming streaming data. Our approach uses a profiling algorithm to automatically determine the compression algorithm that should be applied, and it supports both lossless and lossy compression techniques. Furthermore, we propose a novel algorithm for determining when profiling should be applied. Finally, our detailed experimental evaluation with commonly used stream processing applications indicates a clear improvement in the applications' throughput when our proposed techniques are applied.
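The profiling idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm, and it does not use Apache Storm: it assumes a hypothetical cost model (compression time plus compressed size divided by link bandwidth), a hypothetical candidate set drawn from Python's standard library, and lossless codecs only. The names `CANDIDATES`, `profile`, and `choose_codec` are invented for illustration.

```python
import bz2
import lzma
import time
import zlib

# Hypothetical candidate set: lossless codecs from the Python standard library.
CANDIDATES = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def profile(sample, bandwidth_bytes_per_s):
    """Estimate, per codec, the seconds needed to compress and send `sample`.

    The cost model (compression time + compressed size / bandwidth) is an
    assumption made for illustration, not the model from the paper.
    """
    # Baseline: send the sample uncompressed, paying only the transfer time.
    costs = {"none": len(sample) / bandwidth_bytes_per_s}
    for name, compress in CANDIDATES.items():
        start = time.perf_counter()
        compressed = compress(sample)
        elapsed = time.perf_counter() - start
        costs[name] = elapsed + len(compressed) / bandwidth_bytes_per_s
    return costs


def choose_codec(sample, bandwidth_bytes_per_s):
    """Return the name of the option with the lowest estimated cost."""
    costs = profile(sample, bandwidth_bytes_per_s)
    return min(costs, key=costs.get)
```

Under this model, a highly repetitive sample sent over a slow link favors one of the codecs, while an empty or incompressible sample favors `"none"`, because compression then adds time without saving bytes. A real system would also need to decide how often to re-run the profiling step, which the abstract treats as a separate algorithm.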
