This study compares classification models for detecting fraudulent financial statements (FFS). Because class imbalance is severe in this domain, the samples selected in existing research tend to be processed in ways that do not reflect realistic conditions. Random Forest is therefore adopted to learn from the imbalanced data, combined with SMOTE sampling. Additional, more informative performance metrics are also reported. The experimental dataset comprises 11726 publicly available Chinese financial disclosures from 2007 to 2017, of which 1314 financial statements were accused of fraud by the China Securities Regulatory Commission (CSRC). The results show that Random Forest outperforms the other algorithms: Artificial Neural Network (ANN), Logistic Regression (LR), Support Vector Machines (SVM), CART, Decision Trees, Bayesian Networks, Bagging, Stacking and AdaBoost.
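
To make the imbalance and the oversampling step concrete, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the authors' implementation: the function name `smote`, the neighbour count `k`, the 1:1 balancing target, and the toy points are all assumptions chosen for illustration, and the Random Forest training step is not shown. Only the class counts (11726 statements, 1314 fraudulent) come from the abstract.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` inside the minority class (itself excluded)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # random position on the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Class counts reported in the abstract.
n_total, n_fraud = 11726, 1314
n_clean = n_total - n_fraud
fraud_share = n_fraud / n_total  # roughly 11% of statements are fraudulent

# Assumption: oversample the fraud class until it matches the non-fraud class.
n_needed = n_clean - n_fraud

# Hypothetical two-feature minority points, not real financial data.
toy_fraud = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(toy_fraud, n_new=6, k=2)
```

Each synthetic point lies on a line segment between two existing minority samples, so oversampling fills in the minority region rather than duplicating records. In practice the synthetic samples would be generated from the training split only, so that evaluation still reflects the original class ratio.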