We study implementation techniques for a parallel query language for nested collections. The language handles collections of three kinds (sets, bags, and sequences), and its expressive power is essentially that of OQL (ODMG93). From the perspective of parallel evaluation, the novelty of such a query language is that it can express nested parallelism, which is naturally associated to nested collections. The first implementation step is a translation into a specially designed algebra for flat sequences, having only flat parallelism: the translation "flattens" the nested parallelism, and we prove that it preserves the asymptotic parallel complexity. The second step consists in an implementation of the sequence algebra on a shared nothing architecture. Here we show that all data communications needed by the sequence algebra operators (with one exception) have a particular communication pattern, called monotone communication. We give a provably optimal algorithm for monotone communications on a shared nothing architecture. Here "optimal" means that for any particular initial and final data layout, its communication cost is absolute minimum (not within a constant factor). To account for the communication costs we chose as shared nothing model the recently proposed LogP model. Finally we report some preliminary experiments of our implementation techniques, on a Log P simulator.
展开▼