Agile Query Processing in Statistical Databases: A Process-In-Memory Approach

Abstract

Statistical database systems are designed to answer queries on summarized data (or macro data); queries on raw records are not allowed in such systems. Because macro data offers aggregate information about the database, statistical queries are also an effective way to provide analytical results in semantic databases. However, traditional statistical databases were proposed for security protection, i.e., hiding the raw records from user queries, and few studies address the optimization of aggregate queries in statistical databases. In this paper, we propose a new process-in-memory (PIM) based processing scheme, called agile query, for accelerating queries in statistical databases. We present two new designs in the agile query. First, we propose an in-memory index that caches aggregate operators (e.g., sum, min, max, count, and average) in main memory. Aggregate queries that hit the in-memory index can be evaluated in memory without incurring any I/O operation. Second, we propose to update the in-memory operator index incrementally, so that the cached data stays consistent with the original data records. We implement the agile query processing framework on top of MySQL and conduct experiments over datasets of various sizes to compare our design with the traditional method in MySQL. The results show that our proposal achieves up to 9 times higher throughput than MySQL under the skewed Zipf query set, and on average about 2 times higher throughput under random and uniformly distributed queries.
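The two designs named in the abstract (an in-memory index of cached aggregates, and incremental updates that keep it consistent with the records) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which the abstract describes only as built on top of MySQL. The class names, the string key format, and the choice to invalidate min/max after a delete are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Iterable, Optional


@dataclass
class AggregateEntry:
    """Cached aggregates for one group of records (hypothetical layout)."""
    count: int = 0
    total: float = 0.0
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    extremes_valid: bool = True  # False once a delete may have removed min/max

    def add(self, value: float) -> None:
        # count, sum, min, and max can all absorb an insert in O(1).
        self.count += 1
        self.total += value
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)

    def remove(self, value: float) -> None:
        # Assumes the caller only removes values that were previously added.
        self.count -= 1
        self.total -= value
        if self.count == 0:
            # Empty group: reset to a clean state.
            self.total, self.minimum, self.maximum = 0.0, None, None
            self.extremes_valid = True
        elif value in (self.minimum, self.maximum):
            # min/max cannot be repaired without rescanning the records,
            # so this sketch marks them stale (an assumption, not from the paper).
            self.extremes_valid = False


class AggregateIndex:
    """In-memory cache of aggregates, keyed by a caller-chosen group key."""

    def __init__(self) -> None:
        self._entries: Dict[Hashable, AggregateEntry] = {}

    def load(self, key: Hashable, values: Iterable[float]) -> None:
        """Build an entry from a full scan (stand-in for a database query)."""
        entry = AggregateEntry()
        for v in values:
            entry.add(v)
        self._entries[key] = entry

    def lookup(self, key: Hashable, op: str) -> Optional[float]:
        """Return the cached aggregate, or None when the caller must fall back."""
        entry = self._entries.get(key)
        if entry is None:
            return None  # cache miss
        if op in ("min", "max") and not entry.extremes_valid:
            return None  # stale extreme: treat as a miss
        if op == "count":
            return entry.count
        if op == "sum":
            return entry.total
        if op == "min":
            return entry.minimum
        if op == "max":
            return entry.maximum
        if op == "avg":
            return entry.total / entry.count if entry.count else None
        raise ValueError(f"unsupported aggregate: {op}")

    def on_insert(self, key: Hashable, value: float) -> None:
        """Incrementally apply an inserted record to a cached entry, if any."""
        entry = self._entries.get(key)
        if entry is not None:
            entry.add(value)

    def on_delete(self, key: Hashable, value: float) -> None:
        """Incrementally apply a deleted record to a cached entry, if any."""
        entry = self._entries.get(key)
        if entry is not None:
            entry.remove(value)


# Hypothetical usage: cache one group, then keep it in sync with updates.
index = AggregateIndex()
index.load("sales:2019", [10, 20, 30])
index.on_insert("sales:2019", 40)
print(index.lookup("sales:2019", "sum"), index.lookup("sales:2019", "max"))
```

In this sketch a `None` result from `lookup` signals that the query has to be evaluated against the underlying database. Sum, count, and average stay exact under both inserts and deletes, whereas min and max are only maintained incrementally under inserts.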

Bibliographic details

Source
International Conference on Knowledge Science, Engineering and Management, 2019, pp. 726-738 (13 pages)
Authors
Shanshan Lu; Peiquan Jin; Lin Mu; Shouhong Wan
Original format: PDF
Keywords
Query processing; Statistical database; Processing in memory

Similar literature

1. Bidirectional Database Storage and SQL Query Exploiting RRAM-Based Process-in-Memory Structure [J]. Sun Yuliang, Wang Yu, Yang Huazhong. ACM Transactions on Storage, 2018, No. 1
2. Visualizing Acquisition, Processing, and Network Statistics Through Database Queries [J]. NASA Tech Briefs, 2014, No. 6
3. An Efficient Approach for Query Processing Over Encrypted Database [J]. Jaafer Al-Saraireh. Journal of computer sciences, 2017, No. 10
4. Agile Query Processing in Statistical Databases: A Process-In-Memory Approach [C]. Shanshan Lu, Peiquan Jin, Lin Mu. International conference on knowledge science, engineering and management, 2019
5. Scalable statistical modeling and query processing over large scale uncertain databases [D]. Kanagal Shamanna, Bhargav. 2011
6. Concurrent query processing in a GPU-based database system [O]. Hao Li, Yi-Cheng Tu, Bo Zeng. 2012
7. An Efficient Approach for Query Processing Over Encrypted Database [O]. Jaafer Al-Saraireh. 2017
8. Precision-Time Tradeoffs: A Paradigm for Processing Statistical Queries on Databases [R]. Srivastava, J., Rotem, D. 1988
