Hash tables on external memory are commonly used for indexing in database management systems. In this paper we present an algorithm that, in an asymptotic sense, achieves the best possible I/O and space complexities. Let B denote the number of records that fit in a block, and let N denote the total number of records. Our hash table uses 1+O(1/ÖB)1+O(1/sqrt{B}) I/Os, expected, for looking up a record (no matter if it is present or not). To insert, delete or change a record that has just been looked up requires 1+O(1/ÖB)1+O(1/sqrt{B}) I/Os, amortized expected, including I/Os for reorganizing the hash table when the size of the database changes. The expected external space usage is 1+O(1/ÖB)1+O(1/sqrt{B}) times the optimum of N/B blocks, and just O(1) blocks of internal memory are needed.
展开▼