Journal: 《统计与信息论坛》

Research on Internet Personal Credit Assessment Based on Imbalanced Samples

Abstract

The rapid growth of internet finance and consumer credit in China has created strong demand for internet-based personal credit reporting. To handle the class imbalance in internet credit data, this study balances the data with random over-sampling, random under-sampling, and SMOTE, then builds decision tree, support vector machine, and random forest classifiers to assess personal credit, using the F-measure and AUC to evaluate the models. The results show that: (1) personal credit assessment is feasible in the internet big-data setting; (2) over-sampling noticeably improves the classification performance of the internet personal credit assessment models; and (3) users with good credit ratings tend to share a common profile: aged 18 to 30, a wage above 2,000 yuan per month, page views mostly between 10 and 20, and relatively early loan application times. An exploration of variable effectiveness in internet personal credit assessment further refutes the claim that "the more variables used, the more accurate the result," and allows variables involving users' private information to be avoided.
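The three balancing strategies and the F-measure named in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy rows, the two features (age, page views), the function names, and the neighbour count `k=3` are all hypothetical, and the SMOTE shown is a simplified binary-class variant.

```python
import random
from collections import Counter


def _rows_of(X, y, label):
    """Return the feature rows whose class equals `label`."""
    return [x for x, lab in zip(X, y) if lab == label]


def random_oversample(X, y, rng):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = _rows_of(X, y, label)
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, rng):
    """Keep a random subset of each class, sized to the smallest class."""
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        kept = rng.sample(_rows_of(X, y, label), target)
        X_out.extend(kept)
        y_out.extend([label] * target)
    return X_out, y_out


def smote(X, y, minority, rng, k=3):
    """Simplified binary SMOTE: interpolate between a minority row and one of
    its k nearest minority neighbours until both classes are the same size."""
    pool = _rows_of(X, y, minority)
    n_new = (len(y) - len(pool)) - len(pool)  # majority size minus minority size
    X_out, y_out = list(X), list(y)
    for _ in range(n_new):
        i = rng.randrange(len(pool))
        base = pool[i]
        others = [p for j, p in enumerate(pool) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        X_out.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
        y_out.append(minority)
    return X_out, y_out


def f_measure(y_true, y_pred, positive=1):
    """F1 score for the `positive` class (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy data: (age, page views); label 1 is the rare class.
X = [(22, 12), (25, 15), (31, 8), (28, 18), (35, 5), (24, 11), (29, 14), (40, 9),
     (45, 3), (50, 2), (38, 4)]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

rng = random.Random(0)
for name, (Xb, yb) in {
    "over": random_oversample(X, y, rng),
    "under": random_undersample(X, y, rng),
    "smote": smote(X, y, minority=1, rng=rng),
}.items():
    print(name, sorted(Counter(yb).items()))
```

In practice, resampling of this kind is normally applied only to the training split, so that the F-measure and AUC are computed on data that keeps the original class ratio.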
