With the proliferation of multimedia data, there is an increasing need to support the indexing and searching of high-dimensional data. Recently, a vector-approximation-based technique called the VA-file has been proposed for indexing high-dimensional data. It has been shown that the VA-file is an effective technique compared to current approaches based on space and data partitioning. The VA-file performs especially well when the data set is uniformly distributed. Real data sets, however, are not uniformly distributed: they are often clustered, and the dimensions of their feature vectors are usually correlated. Effective indexing of high-dimensional data therefore requires more careful analysis of non-uniform or correlated data. We propose a solution to these problems, the VA+-file, a new technique for indexing high-dimensional data sets based on vector approximations. We conclude with an evaluation of nearest neighbor queries and show that the VA+-file technique results in significant improvements over the current VA-file approach for several real data sets.