Searching for all occurrences of a twig pattern in a XML document is an important operation in XML query processing. Recently a class of holistic twig pattern matching algorithms has been proposed. Compared with the prior approaches, the holistic method avoids generating large intermediate results which do not contribute to the final answer. The method is CPU and I/O optimal when twig patterns only have ancestor-descendant relationships.The holistic twig-pattern matching method proposed earlier operates on element streams which cluster all XML elements with the same tag name together. In this paper we introduce a clustering method called Prefix Path Streaming (PPS) and new holistic twig pattern matching algorithms based on PPS. PPS clusters elements of XML documents according to the paths from root to the elements. This clustering approach avoids unnecessary scanning of irrelevant portion of XML documents.More importantly, we develop optimal algorithms based on PPS streaming which can process a large class of twig patterns consisting of both ancestor-descendant and parent-child relationships.
展开▼