Distributed and Parallel Databases

A memory-optimal many-to-many semi-stream join

Abstract

Semi-stream join algorithms join a fast stream input with a disk-based master data relation. A common class of these algorithms is derived from hash joins: they use the stream as build input for a main hash table, and also include a cache for frequent master data. The composition of the cache is very important for performance; however, the decision of which master data to cache has so far been solely based on heuristics. We present the first formal criterion, a cache inequality that leads to a provably optimal composition of the cache in a semi-stream many-to-many equijoin algorithm. We propose a novel algorithm, Semi-Stream Balanced Join (SSBJ), which exploits this cache inequality to achieve a given service rate with a provably minimal amount of memory for all stream distributions. We present a cost model for SSBJ and compare its service rate empirically and analytically with other related approaches.
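
To make the general setting concrete, the following is a minimal sketch of a hash-join-derived semi-stream join with a cache, as described in the first sentences of the abstract. It is not SSBJ and does not implement the cache inequality, whose form is not given here. The names `semi_stream_join`, `master_partitions`, `cache` and `table_capacity` are hypothetical, the cache contents are assumed to be supplied by the caller, and the stream is processed in simple batches rather than with the continuous processing a real algorithm would use.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Iterator, List, Tuple

Row = Tuple[Hashable, str]  # (join key, payload); a hypothetical row layout


def semi_stream_join(
    stream: Iterable[Row],
    master_partitions: List[List[Row]],
    cache: Dict[Hashable, List[str]],
    table_capacity: int,
) -> Iterator[Tuple[Hashable, str, str]]:
    """Yield (key, stream_payload, master_payload) for every matching pair.

    Assumption: for each cached key, `cache` holds ALL master payloads with
    that key, so a cache hit produces the complete many-to-many result.
    """
    # Main hash table: the stream is the build input.
    table: Dict[Hashable, List[str]] = defaultdict(list)
    buffered = 0

    def flush() -> Iterator[Tuple[Hashable, str, str]]:
        if not table:
            return
        # One full pass over the disk-based relation, partition by partition,
        # probing the hash table built from buffered stream tuples.
        for partition in master_partitions:
            for key, master_payload in partition:
                for stream_payload in table.get(key, ()):
                    yield key, stream_payload, master_payload
        table.clear()

    for key, stream_payload in stream:
        if key in cache:
            # Cache hit: join immediately without touching the disk relation.
            for master_payload in cache[key]:
                yield key, stream_payload, master_payload
            continue
        # Cache miss: buffer the stream tuple until the next pass.
        table[key].append(stream_payload)
        buffered += 1
        if buffered >= table_capacity:
            yield from flush()
            buffered = 0

    # Join whatever is still buffered when the stream ends.
    yield from flush()
```

In this sketch every stream tuple whose key is cached is answered from memory, while all other tuples wait for a pass over the master data. Which keys to place in the cache is left to the caller; that choice is the question the cache inequality in the paper addresses.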
