Holistic UDAFs at streaming speeds

Abstract

Many algorithms have been proposed to approximate holistic aggregates, such as quantiles and heavy hitters, over data streams. However, little work has been done to explore what techniques are required to incorporate these algorithms into a data stream query processor and to make them useful in practice. In this paper, we study the performance implications of using user-defined aggregate functions (UDAFs) to incorporate selection-based and sketch-based algorithms for holistic aggregates into a data stream management system's query processing architecture. We identify key performance bottlenecks and tradeoffs, and propose novel techniques to make these holistic UDAFs fast and space-efficient for use in high-speed data stream applications. We evaluate performance using generated and actual IP packet data, focusing on approximating quantiles and heavy hitters. The best of our current implementations can process streaming queries at OC48 speeds (2 x 2.4 Gbps).
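
To make the idea of a holistic UDAF concrete, the following is a minimal sketch of a heavy-hitters aggregate written against an init/update/output interface. It is not taken from the paper: the class name `MisraGriesUDAF`, its method names, and the choice of the Misra-Gries counter algorithm are all assumptions made for illustration, and the paper's own selection-based and sketch-based implementations may differ substantially.

```python
class MisraGriesUDAF:
    """Hypothetical UDAF-style heavy-hitters aggregate (Misra-Gries).

    Keeps at most k - 1 counters. Any item whose true frequency
    exceeds n / k (n = tuples seen) is guaranteed to hold a counter,
    and each stored count underestimates the true count by at most n / k.
    """

    def __init__(self, k):
        if k < 2:
            raise ValueError("k must be at least 2")
        self.k = k
        self.counters = {}  # item -> approximate count
        self.n = 0          # number of tuples seen so far

    def update(self, item):
        """Called once per input tuple."""
        self.n += 1
        if item in self.counters:
            self.counters[item] += 1
        elif len(self.counters) < self.k - 1:
            self.counters[item] = 1
        else:
            # No free counter: decrement every counter, dropping zeros.
            for key in list(self.counters):
                self.counters[key] -= 1
                if self.counters[key] == 0:
                    del self.counters[key]

    def output(self):
        """Return the candidate heavy hitters with approximate counts."""
        return dict(self.counters)


# Usage: a tiny stream in which "a" occurs 5 times out of 8 tuples.
agg = MisraGriesUDAF(k=3)
for item in "aaabcada":
    agg.update(item)
print(agg.output())
```

The state is bounded by k - 1 counters regardless of stream length, which illustrates the kind of space-efficient per-tuple update that a holistic aggregate needs at line rate. The candidates returned by `output` may include false positives, so a real query would typically filter them against a threshold.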
