Although scientific data analysis increasingly requires access to and manipulation of large quantities of data, current database technology fails to meet the needs of scientific processing in a number of areas. To overcome acceptance problems among scientific users, database systems must provide performance and functionality comparable to current combinations of scientific programs and file systems. We therefore propose extending the concept of a database query to include numeric computation over scientific databases. In this paper, we examine the specification of an integrated algebra that includes traditional database operators for pattern matching and search as well as numeric operators for scientific data sets. A single integrated algebra lets us perform automatic optimization on scientific computations, realizing all of the traditional benefits of optimization. We have experimented with a prototype optimizer that integrates set, time series, and spectrum data types and the operators on those types. Our results demonstrate that scientific database computations using numeric operators on multiple data types can be effectively optimized, yielding performance gains that could not be realized without the integration. This research was performed in collaboration with the Space Grant College at the University of Colorado at Boulder, where the results are being applied to the analysis of experimental data from satellite observations.