A classifier trained on an imbalanced dataset tends to perform poorly on the minority class, which is rare but important. A number of over-sampling techniques adjust the class distribution by inserting minority instances into the dataset. Unfortunately, these additional instances add considerably to the computation required to build a classifier. In this paper, a new simple and effective under-sampling technique called MUTE is proposed. Its strategy is to remove noisy majority instances that overlap with minority instances. The majority instances to be removed are selected according to their safe levels, following the Safe-Level-SMOTE concept. MUTE not only reduces classifier construction time, because the dataset is smaller, but also improves the prediction rate on the minority class. The experimental results show that MUTE improves the F-measure compared with SMOTE techniques.
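The idea of safe-level-based under-sampling can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the exact safe-level definition or removal rule, so the sketch assumes that the safe level of a majority instance is the number of majority instances among its k nearest neighbours, and that instances whose safe level falls below a threshold are removed. The function names `safe_levels` and `mute_sketch`, and the parameters `k` and `threshold`, are hypothetical.

```python
from math import dist


def safe_levels(X, y, majority_label, k=5):
    """Assumed definition: for each majority instance, count the
    majority instances among its k nearest neighbours (Euclidean)."""
    levels = {}
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != majority_label:
            continue
        # Indices of all other points, ordered by distance to xi.
        others = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: dist(xi, X[j]),
        )
        levels[i] = sum(1 for j in others[:k] if y[j] == majority_label)
    return levels


def mute_sketch(X, y, majority_label, k=5, threshold=1):
    """Drop majority instances whose assumed safe level is below
    `threshold`; minority instances are always kept."""
    levels = safe_levels(X, y, majority_label, k)
    keep = [i for i in range(len(X)) if i not in levels or levels[i] >= threshold]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: a majority cluster (label 0) near the origin, a minority
# cluster (label 1) near (5, 5), and one majority point sitting inside
# the minority region.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (5.5, 5.5)]
y = [0, 0, 0, 0, 1, 1, 1, 0]

X_res, y_res = mute_sketch(X, y, majority_label=0, k=3, threshold=1)
# The overlapping majority point (5.5, 5.5) is dropped; all others remain.
```

Because the procedure only removes instances, the resulting dataset is never larger than the original, which is consistent with the reduced classifier construction time claimed above.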