Journal: Frontiers of computer science in China

Handling query skew in large indexes: a view based approach


Abstract

Indexing is one of the most important techniques for facilitating query processing over multi-dimensional datasets. A commonly used strategy is to keep the tree-structured index balanced. This strategy reduces query processing cost in the worst case and handles all queries equally well. In other words, it implicitly assumes that queries are issued uniformly, partly because the query distribution cannot be known in advance and changes over time in practice. The key question we study in this work is whether it is best to rely entirely on a balanced tree-structured index, particularly as datasets grow ever larger in the big data era. When a dataset becomes very large, it is unreasonable to assume that all data in every subspace are equally important and are accessed uniformly by all queries at the index level. Given the existence of query skew and the possibility that it changes, we study how to handle query skew and its changes at the index level, without sacrificing support for arbitrary queries in a well-balanced tree index and without high overhead. To tackle this issue, we propose index-views at the index level, where an index-view is a shortcut in a balanced tree-structured index for accessing objects in the more frequently accessed subspaces, and we propose a new index-view-centric framework that processes queries using index-views in a bottom-up manner. We study the index-view selection problem in both static and dynamic settings, and we confirm the effectiveness of our approach using large real and synthetic datasets.
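The shortcut idea can be illustrated with a toy sketch. This is not the paper's algorithm or data structure: it assumes a one-dimensional balanced binary tree over distinct sorted keys, and the names `Node`, `build`, `search`, `find_subtree`, and `query` are hypothetical. An "index-view" here is simply a remembered pointer to the subtree covering a hot key range, so a query that falls inside it starts from that subtree and otherwise falls back to the root.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    lo: int                            # smallest key under this node
    hi: int                            # largest key under this node
    keys: Optional[List[int]] = None   # set only on leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(keys: List[int], leaf_size: int = 2) -> Node:
    """Build a balanced binary tree over sorted, distinct keys."""
    if len(keys) <= leaf_size:
        return Node(keys[0], keys[-1], keys=list(keys))
    mid = len(keys) // 2
    left, right = build(keys[:mid], leaf_size), build(keys[mid:], leaf_size)
    return Node(left.lo, right.hi, left=left, right=right)


def search(node: Node, lo: int, hi: int) -> Tuple[List[int], int]:
    """Range search starting at `node`; returns (matching keys, nodes visited)."""
    if hi < node.lo or lo > node.hi:
        return [], 1                   # pruned, but the node was still inspected
    if node.keys is not None:
        return [k for k in node.keys if lo <= k <= hi], 1
    left_hits, left_cost = search(node.left, lo, hi)
    right_hits, right_cost = search(node.right, lo, hi)
    return left_hits + right_hits, 1 + left_cost + right_cost


def find_subtree(root: Node, lo: int, hi: int) -> Node:
    """Descend to the deepest node whose key range still covers [lo, hi]."""
    node = root
    while node.keys is None:
        if node.left.lo <= lo and hi <= node.left.hi:
            node = node.left
        elif node.right.lo <= lo and hi <= node.right.hi:
            node = node.right
        else:
            break
    return node


def query(root: Node, views: List[Node], lo: int, hi: int) -> Tuple[List[int], int]:
    """Try the shortcuts first; fall back to the root if none covers the query."""
    for view in views:
        if view.lo <= lo and hi <= view.hi:
            return search(view, lo, hi)
    return search(root, lo, hi)


root = build(list(range(64)))
hot_view = find_subtree(root, 40, 43)   # assumed hot region, chosen by hand
print(query(root, [hot_view], 41, 42))  # answered from the shortcut
print(query(root, [], 41, 42))          # same answer, more nodes visited
print(query(root, [hot_view], 10, 12))  # outside the view: falls back to root
```

In this sketch the balanced tree is left untouched, so any query can still be answered from the root; the view only shortens the path for queries inside the hot range. How views are selected and maintained as the skew changes is the subject of the paper and is not modeled here.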
