AU2019100362A4, 20190509

ABSTRACT

Owing to the reform of the economic system, the personal credit system now plays a significant role in increasing economic efficiency. We aim to build a logistic regression model in Python on the basis of existing data, so that it can automatically evaluate the credit rating of a finance company's customers when faced with a large volume of data. We found that a quicker and more accurate algorithm helps to handle such data better. The present application relates to a personal credit rating system based on logistic regression. Our research process is as follows. First, we acquire a large amount of data collected from a finance company. Next comes data preprocessing, which consists of four main steps: feature acquisition, missing value processing, data normalization and feature selection. We then build the model: we select several algorithms, train each one on the training data, and identify the optimal model by predicting with and evaluating each of them. Finally, after testing, the model can be applied to the selected data. After testing several algorithms, including K Nearest Neighbors (KNN), Logistic Regression (LR) and Random Forest (RF), with different parameters, we found that LR is the optimal choice, being relatively more accurate and stable, especially when the feature dimension is 300. We applied feature selection, starting at a dimension of 150 and increasing it in steps of 50, and found that with 300 features the Area Under Curve (AUC), accuracy and precision are close to their maxima. This approach can be effectively applied to help the finance company process its data with high accuracy and precision, enabling it to assess the credit rating of its customers and predict new customers' attributes.
In this way, the company can decide whether to grant a loan to a customer and the time lag of the loan.

Figure 1. Flowchart labels: Data acquisition; Feature acquisition; Data importing; Missing value handling; Data preprocessing; Data normalization; Feature selection; Model building; Algorithm determination; Model training; Model prediction; Model evaluation; Optimal model; Model application.
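
The preprocessing and model-building steps named in the abstract (missing value processing, data normalization, model training, and evaluation by AUC) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the applicants' implementation: the synthetic data, the two hypothetical features (income and debt), the choice of mean imputation and min-max scaling, and the gradient-descent settings are all assumptions made for the example. Feature selection is left out because the abstract does not say which method was used.

```python
import math
import random


def impute_mean(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def min_max_normalize(rows):
    """Rescale every column to [0, 1]; constant columns are mapped to 0."""
    n_cols = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(n_cols)]
    highs = [max(r[j] for r in rows) for j in range(n_cols)]
    return [
        [(r[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
         for j in range(n_cols)]
        for r in rows
    ]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    """Predicted probability of the positive class for each row."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic stand-in for the finance company's data (purely illustrative).
random.seed(0)
raw, labels = [], []
for _ in range(200):
    income = random.uniform(20, 100)  # hypothetical feature
    debt = random.uniform(0, 50)      # hypothetical feature
    labels.append(1 if income - 1.5 * debt + random.gauss(0, 5) > 20 else 0)
    # Drop about 5% of the income values to exercise the imputation step.
    raw.append([income if random.random() > 0.05 else None, debt])

# For brevity the statistics are computed on all rows; in practice they
# would be fitted on the training split only.
X = min_max_normalize(impute_mean(raw))
X_train, y_train = X[:150], labels[:150]
X_test, y_test = X[150:], labels[150:]

w, b = train_logistic_regression(X_train, y_train)
test_auc = auc(y_test, predict_proba(X_test, w, b))
print(f"test AUC: {test_auc:.3f}")
```

To reproduce the kind of comparison the abstract describes, the same evaluation would be repeated for each candidate algorithm and for each feature dimension (150, 200, 250, 300, and so on), keeping the configuration whose AUC, accuracy and precision are highest.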