The present invention relates to an undersampling-based ensemble method for resolving data imbalance. The method is characterized by including the steps of: dividing a large body of corporate insolvency data into a majority class (normal companies) and a minority class (insolvent companies); forming a set of subgroups by undersampling; measuring the similarity between the data of the population and the data of each subgroup in order to reduce the information loss of the subgroup relative to the population; training a base learner on each subgroup and constructing an ensemble; and evaluating the performance of each classifier on a test set for verification and measuring the statistical significance of the performance differences.
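The undersampling, similarity-measurement, and ensemble steps described above can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not name a similarity measure, a base learner, or a combination rule, so the centroid-distance similarity, the nearest-centroid learner, the majority vote, the rule of keeping only the most similar subgroups, and all function names and synthetic data below are assumptions chosen for illustration. The evaluation and significance-testing step is not covered.

```python
import random
from statistics import mean


def make_balanced_subgroups(majority, minority, n_subgroups, seed=0):
    """Undersample the larger class so each subgroup matches the minority size."""
    rng = random.Random(seed)
    k = len(minority)
    # Each subgroup pairs k randomly drawn majority rows with all minority rows.
    return [(rng.sample(majority, k), list(minority)) for _ in range(n_subgroups)]


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def similarity(population, sample):
    """Assumed stand-in measure: 1 / (1 + distance between centroids), in (0, 1]."""
    return 1.0 / (1.0 + sq_dist(centroid(population), centroid(sample)) ** 0.5)


def select_most_similar(subgroups, majority, keep):
    """Assumed use of the similarity: keep the subgroups closest to the population."""
    ranked = sorted(subgroups, key=lambda g: similarity(majority, g[0]), reverse=True)
    return ranked[:keep]


def fit_nearest_centroid(neg_rows, pos_rows):
    """Hypothetical base learner: predict 1 if closer to the minority centroid."""
    c_neg, c_pos = centroid(neg_rows), centroid(pos_rows)
    return lambda x: 1 if sq_dist(x, c_pos) < sq_dist(x, c_neg) else 0


def ensemble_predict(models, x):
    """Combine the base learners by simple majority vote."""
    votes = sum(m(x) for m in models)
    return 1 if 2 * votes > len(models) else 0


# Synthetic two-feature data: 200 "normal" rows and 20 "insolvent" rows.
rng = random.Random(42)
majority = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
minority = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(20)]

subgroups = make_balanced_subgroups(majority, minority, n_subgroups=7)
kept = select_most_similar(subgroups, majority, keep=5)
models = [fit_nearest_centroid(neg, pos) for neg, pos in kept]
print(ensemble_predict(models, [3.0, 3.0]), ensemble_predict(models, [0.0, 0.0]))
```

An odd number of retained subgroups is used here so that the majority vote cannot tie. In practice the base learner would be replaced by whatever classifier is under study, and the similarity measure by one suited to the feature distribution.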