Hadoop has become a widely used open-source platform for large-scale data analysis in commercial applications, and the Hadoop Distributed File System (HDFS) is its core component. However, HDFS cannot be used directly to manage raster data, because raster data carries geographic location information. In this paper, we describe the implementation of a tile-based, scalable raster data management system built on HDFS. While preserving the basic architecture of HDFS, we reorganize the data structure within each block, add metadata, design an in-block index structure, keep an overlapping region between adjacent blocks, and offer users a compression option. We also provide functions for reading raster data from HDFS as a tile stream. These optimizations match the characteristics of raster data to the architecture of HDFS, and MapReduce applications can be built on top of the raster data management system.
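The abstract does not specify the block layout, so the following is only a minimal sketch of the three ideas it names: an overlapping region between adjacent blocks, an index stored inside each block, and optional compression. Every name here (`block_window`, `pack_block`, `read_tile`), the index entry format, and the use of zlib are assumptions made for illustration, not the system's actual design.

```python
import struct
import zlib

OVERLAP = 1  # pixels shared with neighbouring blocks (hypothetical value)

# Hypothetical on-disk layout: [tile count][index entries][tile payloads]
HEADER = struct.Struct("<I")     # number of tiles in the block
ENTRY = struct.Struct("<IIQI")   # tile_row, tile_col, byte offset, byte length


def block_window(block_row, block_col, block_px, raster_h, raster_w,
                 overlap=OVERLAP):
    """Pixel window (row0, row1, col0, col1) of one block.

    The window is grown by `overlap` pixels on every side and clamped to
    the raster, so adjacent blocks share a strip of pixels.
    """
    row0 = max(block_row * block_px - overlap, 0)
    col0 = max(block_col * block_px - overlap, 0)
    row1 = min((block_row + 1) * block_px + overlap, raster_h)
    col1 = min((block_col + 1) * block_px + overlap, raster_w)
    return row0, row1, col0, col1


def pack_block(tiles, compress=False):
    """Serialize tiles into one block with the index placed up front.

    `tiles` maps (tile_row, tile_col) to the raw bytes of that tile.
    """
    index, payloads = [], []
    offset = HEADER.size + ENTRY.size * len(tiles)
    for (tile_row, tile_col), data in sorted(tiles.items()):
        body = zlib.compress(data) if compress else data
        index.append(ENTRY.pack(tile_row, tile_col, offset, len(body)))
        payloads.append(body)
        offset += len(body)
    return HEADER.pack(len(tiles)) + b"".join(index) + b"".join(payloads)


def read_tile(block, tile_row, tile_col, compressed=False):
    """Look up one tile through the in-block index and return its bytes."""
    (count,) = HEADER.unpack_from(block, 0)
    for i in range(count):
        row, col, offset, length = ENTRY.unpack_from(
            block, HEADER.size + i * ENTRY.size)
        if (row, col) == (tile_row, tile_col):
            body = block[offset:offset + length]
            return zlib.decompress(body) if compressed else body
    raise KeyError((tile_row, tile_col))
```

In this sketch, a reader that needs a single tile scans only the small index at the head of the block and then slices out that tile's bytes, and the overlap lets a neighbourhood operation near a block edge run without fetching the adjacent block.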