In this paper, we develop a principled framework for efficient processing of ad-hoc top-k (ranking) aggregate queries in OLAP. Such queries provide decision makers with the k groups that have the highest aggregate values. Current RDBMSs lack essential support for top-k aggregate queries: they process such queries with a naive, overkill materialize-group-sort scheme, which can be prohibitively inefficient. Our new framework is based on two fundamental properties, the Group-Ranking and Tuple-Ranking Principles. These principles dictate group-ordering and tuple-ordering requirements that together guide the query processor toward optimal aggregate query processing. To realize these requirements, we propose a new execution model and address the challenges of implementing new query operators, enabling efficient top-k aggregate query plans that are both group-aware and rank-aware. Our experimental study validates the framework by demonstrating orders-of-magnitude performance improvements of the new query plans over the traditional approach.