The multi-instance model has been employed in image retrieval because it handles the ambiguities in an image well. However, many multi-instance learning methods, such as Diverse Density, meet neither the real-time requirement nor the required retrieval accuracy, and therefore need to be improved. This paper selects instances by moving from regions to words, so that the regions carry richer semantics and become less ambiguous. First, it applies Mean Shift to segment the image adaptively. Second, it extracts the spatial invariant features of each region and computes their sparse codes. Third, it applies a max-pooling function to the code vectors to obtain the feature vector of each instance. Finally, it uses MI-SVM as the multi-instance learning method. Experiments show that precision improves markedly and that the retrieval time meets the real-time requirement.
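
The max-pooling step can be illustrated with a minimal sketch. It assumes that each segmented region already has a matrix of sparse codes (one row per local descriptor, one column per dictionary word) and that pooling takes the largest absolute coefficient per word, which is a common convention but is not stated in the abstract. The function names `max_pool_codes` and `image_to_bag` are hypothetical, and the segmentation, sparse coding, and MI-SVM stages are not shown.

```python
import numpy as np


def max_pool_codes(codes: np.ndarray) -> np.ndarray:
    """Pool the sparse codes of one region into a single instance vector.

    codes has shape (n_descriptors, n_words). The result has shape
    (n_words,) and holds, for each dictionary word, the largest absolute
    coefficient found in the region (an assumed pooling convention).
    """
    if codes.ndim != 2 or codes.shape[0] == 0:
        raise ValueError("codes must be a non-empty 2-D array")
    return np.max(np.abs(codes), axis=0)


def image_to_bag(region_codes: list) -> np.ndarray:
    """Build the multi-instance bag of one image.

    region_codes holds one sparse-code matrix per segmented region.
    Each region becomes one instance, so the bag has shape
    (n_regions, n_words).
    """
    return np.stack([max_pool_codes(codes) for codes in region_codes])


# Toy example: two regions, a dictionary of three words.
region_a = np.array([[0.0, -0.5, 0.0],
                     [0.2, 0.1, 0.0]])
region_b = np.array([[0.0, 0.0, 0.7]])
bag = image_to_bag([region_a, region_b])
```

In this sketch an image is a bag and each region is an instance, so `bag` is the kind of array a multi-instance learner such as MI-SVM would take as input for one image.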