In this paper, we focus on efficient keyword query processing for XML data under the SLCA and ELCA semantics. We propose a novel form of inverted list in which the list for a keyword includes the IDs of all nodes that directly or indirectly contain that keyword. Building on these lists, we propose a family of efficient algorithms for both semantics that are based on the set intersection operation. We show that SLCA/ELCA computation reduces to finding the set of nodes that appear in all of the involved inverted lists and satisfy certain conditions. We also propose several optimization techniques to further improve query processing performance. We have conducted extensive experiments comparing our methods with many alternative methods. The results demonstrate that the proposed methods outperform previous methods by up to two orders of magnitude in many cases.
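To make the set-intersection idea concrete, the following is a minimal sketch for the SLCA case only. It is not the algorithm from the paper: it assumes the tree is given as a hypothetical `parent` map from node ID to parent ID, uses Python sets in place of sorted ID lists, and takes the "certain condition" for SLCA to be that a common node has no child that is also common. The names `build_extended_lists` and `slca` are illustrative assumptions.

```python
def build_extended_lists(parent, direct):
    """Build one inverted list per keyword.

    parent: dict mapping node ID -> parent ID (the root maps to None).
    direct: dict mapping keyword -> node IDs that directly contain it.
    Returns a dict mapping keyword -> set of node IDs that contain the
    keyword directly or indirectly (the nodes themselves plus all ancestors).
    """
    lists = {}
    for keyword, nodes in direct.items():
        ids = set()
        for node in nodes:
            # Walk up to the root; stop early once we reach a node whose
            # ancestors were already added by a previous walk.
            while node is not None and node not in ids:
                ids.add(node)
                node = parent[node]
        lists[keyword] = ids
    return lists


def slca(parent, lists, query):
    """Return SLCA node IDs for the query keywords (assumed semantics).

    A node qualifies if it appears in every keyword's list and none of
    its children does. Checking children is enough because each list is
    closed under taking ancestors.
    """
    if not query or any(kw not in lists for kw in query):
        return []
    common = set.intersection(*(lists[kw] for kw in query))
    # Parents of common nodes have a common child, so they are not smallest.
    has_common_child = {parent[n] for n in common if parent[n] is not None}
    return sorted(common - has_common_child)


# Toy document: root 1 with two "book" nodes (2 and 5), each holding a
# title node and an author node.
parent = {1: None, 2: 1, 3: 2, 4: 2, 5: 1, 6: 5, 7: 5}
direct = {"xml": [3, 6], "smith": [4], "jones": [7]}
lists = build_extended_lists(parent, direct)

result_same_book = slca(parent, lists, ["xml", "smith"])
result_across_books = slca(parent, lists, ["smith", "jones"])
```

In this toy document, "xml" and "smith" meet inside the first book, so the sketch returns node 2, while "smith" and "jones" only meet at the root. ELCA needs an additional condition on top of the same intersection and is not covered here.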