BlendDB: blending table layouts to support efficient browsing of relational databases

Abstract

The physical implementation of most relational databases follows their logical description: each relation is stored in its own file or collection of files on disk. Such an implementation works well for queries that filter or aggregate large portions of a single table, and it provides reasonable performance for queries that join many records from one table to another. It is far less suitable, however, for join queries that follow paths from a small number of tuples in one table to small collections of tuples in other tables in order to accumulate facts about a related collection of objects (e.g., the co-authors of a particular author in a publications database), because answering such a query requires one or more random I/Os for each table on the path. If the primary workload of a database consists of many such path queries, as is likely when supporting browsing-oriented applications, performance will be quite poor. This thesis focuses on optimizing the performance of these path queries in a system called BlendDB, a relational database that supports on-disk co-location of tuples from different relations. To make BlendDB efficient, the thesis will propose a clustering algorithm that, given knowledge of the database workload, co-locates the tuples of multiple relations if they join along common paths. To support the claim of improved performance, the thesis will include experiments in which BlendDB provides better performance than traditional relational databases on queries against the IMDB movie dataset. Additionally, this thesis will show that BlendDB provides performance commensurate with that of materialized views while using less disk space, and that it can achieve better performance than materialized views, in exchange for more disk space, when users navigate between related items in the database.
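To make the co-location idea concrete, the following is a minimal sketch, not BlendDB's actual clustering algorithm (which the abstract does not detail). It assumes a hypothetical toy publications database, a hypothetical page capacity of five tuples, and a simple greedy grouping that places each paper next to its authorship rows and authors. The number of distinct pages touched stands in for random I/Os.

```python
PAGE_SIZE = 5  # tuples per page (hypothetical capacity)

# Hypothetical toy publications database (not from the thesis).
authors = [1, 2, 3, 4, 5, 6]
papers = [10, 11, 12]
authorship = [(1, 10), (2, 10), (3, 11), (4, 11), (5, 12), (6, 12)]  # (author, paper)


def pack(tuples, first_page=0):
    """Assign tuples to fixed-size pages in the given order: {tuple: page_id}."""
    return {t: first_page + i // PAGE_SIZE for i, t in enumerate(tuples)}


def per_table_layout():
    """Conventional layout: each relation is packed into its own run of pages."""
    layout, next_page = {}, 0
    relations = (("author", authors), ("paper", papers), ("authorship", authorship))
    for name, rows in relations:
        placed = pack([(name, r) for r in rows], next_page)
        layout.update(placed)
        next_page = max(placed.values()) + 1
    return layout


def blended_layout():
    """Greedy stand-in for clustering: store each paper with the authorship
    rows and author tuples that join to it, so one path shares a page."""
    ordered = []
    for p in papers:
        ordered.append(("paper", p))
        for a, q in authorship:
            if q == p:
                ordered.append(("authorship", (a, q)))
                ordered.append(("author", a))
    ordered = list(dict.fromkeys(ordered))  # drop repeats, keep first position
    return pack(ordered)


def coauthor_path(author_id):
    """Tuples visited by the path author -> papers -> co-authors."""
    visited = [("author", author_id)]
    for a, p in authorship:
        if a == author_id:
            visited += [("authorship", (a, p)), ("paper", p)]
            for b, q in authorship:
                if q == p and b != author_id:
                    visited += [("authorship", (b, q)), ("author", b)]
    return visited


def pages_touched(layout, tuples):
    """Number of distinct pages a query reads under the given layout."""
    return len({layout[t] for t in tuples})


if __name__ == "__main__":
    path = coauthor_path(1)
    print("per-table pages:", pages_touched(per_table_layout(), path))
    print("blended pages:  ", pages_touched(blended_layout(), path))
```

In this toy setting the co-author query for author 1 reads three pages under the per-table layout (one per relation on the path) and one page under the blended layout. Real workloads have overlapping paths and uneven fan-out, which is why a workload-aware clustering algorithm is needed rather than this fixed grouping.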