Journal: SIGMOD Record

Optimistically Compressed Hash Tables & Strings in the USSR

Abstract

Modern query engines rely heavily on hash tables for query processing. Overall query performance and memory footprint are often determined by how hash tables and the tuples within them are represented. In this work, we propose three complementary techniques to improve this representation. Domain-Guided Prefix Suppression bit-packs keys and values tightly to reduce hash table record width. Optimistic Splitting decomposes values (and operations on them) into frequently and infrequently accessed value slices (and operations on those slices). By removing the infrequently accessed value slices from the hash table record, it improves cache locality. The Unique Strings Self-aligned Region (USSR) accelerates the handling of frequently occurring strings, which are widespread in real-world data sets, by creating an on-the-fly dictionary of the most frequent strings. This allows many string operations to be executed with integer logic and reduces memory pressure. We integrated these techniques into Vectorwise. On the TPC-H benchmark, our approach reduces peak memory consumption by 2-4× and improves performance by up to 1.5×. On a real-world BI workload, we measured a 2× improvement in performance, and in micro-benchmarks we observed speedups of up to 25×.
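
The idea of bit-packing keys tightly can be illustrated with a small sketch. It is not the paper's or Vectorwise's implementation: it assumes each column has known minimum and maximum bounds, subtracts the minimum, and stores only the bits that remain. The function names (`bit_width`, `pack`, `unpack`) and the example domains are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence, Tuple

Domain = Tuple[int, int]  # (min, max) bounds of a column; assumed known up front


def bit_width(lo: int, hi: int) -> int:
    """Bits needed for any value in [lo, hi] once lo has been subtracted."""
    return max(1, (hi - lo).bit_length())


def pack(values: Sequence[int], domains: Sequence[Domain]) -> int:
    """Pack one record's column values into a single integer word."""
    word = 0
    shift = 0
    for value, (lo, hi) in zip(values, domains):
        if not lo <= value <= hi:
            raise ValueError(f"{value} outside domain [{lo}, {hi}]")
        # Store only the offset from the domain minimum.
        word |= (value - lo) << shift
        shift += bit_width(lo, hi)
    return word


def unpack(word: int, domains: Sequence[Domain]) -> List[int]:
    """Recover the original column values from a packed word."""
    values = []
    for lo, hi in domains:
        width = bit_width(lo, hi)
        values.append((word & ((1 << width) - 1)) + lo)
        word >>= width
    return values


# Hypothetical columns: year, month, quantity.
domains = [(1990, 2025), (1, 12), (0, 1000)]
record = (2019, 7, 42)
packed = pack(record, domains)
total_bits = sum(bit_width(lo, hi) for lo, hi in domains)
```

With these example domains the three columns need 6, 4, and 10 bits, so the record fits in 20 bits rather than three full-width integers. Narrower hash table records are the effect the abstract attributes to Domain-Guided Prefix Suppression.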
