SchemaSQL is a recently proposed extension to SQL for enabling multi-database interoperability. Several recently identified applications for SchemaSQL, however, rely mainly on its ability to treat data and schema labels in a uniform manner, and call for an efficient implementation of it on a single RDBMS. We first develop a logical algebra for SchemaSQL by combining classical relational algebra with four restructuring operators (unfold, fold, split, and unite) originally introduced in the context of the tabular data model by Gyssens et al. [GLS96], and suitably adapted to fit the needs of SchemaSQL. We give an algorithm for translating SchemaSQL queries/views involving restructuring into this logical algebra. We also provide physical algebraic operators which are useful for query optimization. Using the various operators as a vehicle, we give several alternate implementation strategies for SchemaSQL queries/views. All the proposed strategies can be implemented non-intrusively on top of existing relational DBMSs, in that they do not require any additions to the existing set of plan operators. We conducted a series of performance experiments based on TPC-D benchmark data, using the IBM DB2 DBMS running on Windows/NT. In addition to showing the relative tradeoffs between the various alternate strategies, our experiments show the feasibility of implementing SchemaSQL on top of a traditional RDBMS in a non-intrusive manner. Furthermore, they also suggest new plan operators which might profitably be added to the existing set available to relational query optimizers, to further boost their performance.
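To make the four restructuring operators concrete, the following is a minimal Python sketch of their general idea on in-memory relations (lists of dictionaries). It is an illustration under our own assumptions, not the paper's definitions or implementation: the function signatures, the `sales` relation, and its column names are all hypothetical, and the precise operator semantics adapted for SchemaSQL are those given in the paper and in [GLS96].

```python
from collections import defaultdict

# Illustrative only: relations are lists of dicts; all names are hypothetical.

def unfold(rows, label_col, value_col):
    """Turn the data values of label_col into column labels (a pivot)."""
    key_cols = [c for c in rows[0] if c not in (label_col, value_col)]
    out = {}
    for r in rows:
        key = tuple(r[c] for c in key_cols)
        rec = out.setdefault(key, dict(zip(key_cols, key)))
        rec[r[label_col]] = r[value_col]
    return list(out.values())

def fold(rows, key_cols, label_col, value_col):
    """Turn non-key column labels back into data values (an unpivot)."""
    return [
        {**{k: r[k] for k in key_cols}, label_col: c, value_col: v}
        for r in rows
        for c, v in r.items()
        if c not in key_cols
    ]

def split(rows, col):
    """Produce one relation per distinct value of col, named by that value."""
    parts = defaultdict(list)
    for r in rows:
        parts[r[col]].append({k: v for k, v in r.items() if k != col})
    return dict(parts)

def unite(parts, col):
    """Merge named relations into one, storing each name in column col."""
    return [{col: name, **r} for name, rows in parts.items() for r in rows]

# Hypothetical sample data.
sales = [
    {"part": "bolt", "region": "east", "qty": 10},
    {"part": "bolt", "region": "west", "qty": 7},
    {"part": "nut", "region": "east", "qty": 3},
]

wide = unfold(sales, "region", "qty")      # regions become column labels
by_region = split(sales, "region")         # regions become relation names
```

In this sketch `fold` undoes `unfold` and `unite` undoes `split`, which mirrors the way schema labels and data values are treated uniformly: a region name can appear as a data value, a column label, or a relation name.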