In a database system, careful optimisation of select, project, and join (SPJ) queries is needed to achieve good performance. This has been an area of much research over the past two decades, yet much remains to be done. Researchers have also begun to view data mining as an integral part of query processing, so that the two should be optimised jointly. In terms of request definiteness, data mining is at one end of the query spectrum and standard SPJ queries are at the other (?). In SPJ queries, the desired result can be fully described ahead of time as a single relation, while in data mining the desired result can only be described after the fact, as rules, decision trees, partitions, or similar constructs (??). Nonetheless, in both cases the user wants to extract information from relational data, and very often the desired information involves both SPJ querying and data mining (e.g., finding all association rules on a relation that is the result of an SPJ query on several base relations). In this paper we introduce a mechanism that facilitates efficient SPJ query processing and data mining in a unified fashion. Using a compression method called Peano Trees (P-trees), I/O can be reduced to a minimum (??), indexes can be eliminated entirely, and query processing can be optimised effectively together with data mining.