Optimistically Compressed Hash Tables & Strings in the USSR

Tim Gubner; Viktor Leis; Peter Boncz

Abstract

Modern query engines rely heavily on hash tables for query processing. Overall query performance and memory footprint are often determined by how hash tables and the tuples within them are represented. In this work, we propose three complementary techniques to improve this representation: Domain-Guided Prefix Suppression bit-packs keys and values tightly to reduce hash table record width. Optimistic Splitting decomposes values (and operations on them) into (operations on) frequently- and infrequently-accessed value slices. By removing the infrequently-accessed value slices from the hash table record, it improves cache locality. The Unique Strings Self-aligned Region (USSR) accelerates handling frequently occurring strings, which are widespread in real-world data sets, by creating an on-the-fly dictionary of the most frequent strings. This allows executing many string operations with integer logic and reduces memory pressure. We integrated these techniques into Vectorwise. On the TPC-H benchmark, our approach reduces peak memory consumption by 2-4× and improves performance by up to 1.5×. On a real-world BI workload, we measured a 2× improvement in performance, and in micro-benchmarks we observed speedups of up to 25×.
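
The abstract describes Domain-Guided Prefix Suppression only at a high level, so the following is a minimal sketch of the general idea, not the authors' Vectorwise implementation. It assumes each column's inclusive value domain is known in advance (for example from min/max statistics), subtracts the domain minimum, and packs the remaining low-order bits of several columns into one integer. The names `Column`, `pack`, and `unpack` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical column descriptor with inclusive domain bounds (assumed known)."""
    lo: int
    hi: int

    @property
    def bits(self) -> int:
        # Bits needed for any value in [lo, hi] once lo has been subtracted.
        return max(1, (self.hi - self.lo).bit_length())


def pack(values, columns):
    """Pack one value per column into a single integer, first column in the lowest bits."""
    word, shift = 0, 0
    for value, col in zip(values, columns):
        if not col.lo <= value <= col.hi:
            raise ValueError(f"{value} outside domain [{col.lo}, {col.hi}]")
        word |= (value - col.lo) << shift
        shift += col.bits
    return word


def unpack(word, columns):
    """Inverse of pack: recover the original values from the packed integer."""
    values = []
    for col in columns:
        mask = (1 << col.bits) - 1
        values.append((word & mask) + col.lo)
        word >>= col.bits
    return values


# Illustrative domains (made up for this sketch): a year, a small code, a signed offset.
columns = [Column(1992, 1998), Column(0, 24), Column(-50, 50)]
row = [1995, 7, -3]

packed = pack(row, columns)
total_bits = sum(col.bits for col in columns)  # 3 + 5 + 7 = 15 bits
restored = unpack(packed, columns)
```

In this sketch the three values fit into 15 bits rather than three full-width integers, which is the kind of record-width reduction the abstract refers to. How the paper handles values that fall outside the assumed domain is not stated in the abstract; the sketch simply raises an error.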

Bibliographic details

Source
SIGMOD Record, 2021, Issue 1, pp. 60-67 (8 pages)
Authors
Tim Gubner; Viktor Leis; Peter Boncz
Affiliations
CWI; FSU Jena; CWI
Original format
PDF
Language
English
