Efficient top-k query processing in highly distributed environments is useful but challenging. This paper addresses the problem over vertically partitioned data and proposes two algorithms, DBPA and BulkDBPA, that aim to lower communication cost. DBPA is a direct extension of the centralized algorithm BPA2 to distributed environments. It inherits BPA2's low number of data accesses and therefore transfers little data, but it requires many communication round trips, which greatly increase its response time. BulkDBPA improves on DBPA by using bulk read and bulk transfer mechanisms, which significantly reduce the number of round trips. Experimental results show that DBPA and BulkDBPA require much less data transfer than SA and TPUT, and that BulkDBPA outperforms the other algorithms in overall performance. We also analyze the effect of different parameters on the query performance of BulkDBPA and, in particular, investigate strategies for setting the bulk size.
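The trade-off between round trips and data transfer that motivates the bulk size can be illustrated with a small simulation. The sketch below is not DBPA or BulkDBPA, whose details are not given here. It is a simplified threshold-style top-k procedure, assuming that each node holds one score list sorted in descending order, that every item appears in every list, and that the overall score is the sum of the local scores. The names `bulk_topk`, `NODE_LISTS`, `rounds`, and `entries_read` are hypothetical, and the toy scores are made up for illustration.

```python
import heapq


def bulk_topk(lists, k, bulk_size):
    """Threshold-style top-k over vertically partitioned score lists.

    lists: one list per node, each holding (item, score) pairs sorted by
    score in descending order. Assumption: every item appears in every
    list and the overall score is the sum of the local scores.
    Returns (top_k, rounds, entries_read).
    """
    # Per-node index that simulates random access to an item's local score.
    lookup = [dict(node_list) for node_list in lists]
    n = len(lists[0])
    totals = {}        # item -> overall score, for every item seen so far
    pos = 0            # sorted-access depth already read from every list
    rounds = 0         # number of bulk sorted-access rounds
    entries_read = 0   # number of sorted-access entries fetched
    top = []
    while pos < n:
        rounds += 1
        end = min(pos + bulk_size, n)
        for node_list in lists:
            chunk = node_list[pos:end]      # one bulk read from this node
            entries_read += len(chunk)
            for item, _ in chunk:
                if item not in totals:
                    totals[item] = sum(index[item] for index in lookup)
        # No unseen item can score higher than the sum of the last scores read.
        threshold = sum(node_list[end - 1][1] for node_list in lists)
        pos = end
        top = heapq.nlargest(k, totals.items(), key=lambda pair: pair[1])
        if len(top) == k and top[-1][1] >= threshold:
            break
    return top, rounds, entries_read


# Toy data (made up): two nodes, five items, integer local scores.
NODE_LISTS = [
    [("a", 9), ("b", 8), ("c", 5), ("d", 3), ("e", 1)],
    [("b", 9), ("a", 7), ("d", 6), ("c", 2), ("e", 1)],
]

for size in (1, 3):
    print(size, bulk_topk(NODE_LISTS, k=2, bulk_size=size))
```

On this toy input, a bulk size of 1 finds the top 2 items in two rounds after reading 4 entries, while a bulk size of 3 needs only one round but reads 6 entries. This is the kind of tension a bulk size setting strategy has to balance.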