Journal: ACM Transactions on the Web

From Footprint to Evidence: An Exploratory Study of Mining Social Data for Credit Scoring

Abstract

With the booming popularity of online social networks such as Twitter and Weibo, user footprints are accumulating rapidly on the social web. At the same time, the question of how to leverage large-scale user-generated social media data for personal credit scoring has drawn the attention of both researchers and practitioners, and it has become a topic of great importance and growing interest in the P2P lending industry. Compared with traditional financial data, however, heterogeneous social data presents both opportunities and challenges for personal credit scoring. In this article, we seek a deep understanding of how to learn users' credit labels from social data in a comprehensive and efficient way. In particular, we explore the social-data-based credit scoring problem in the microblogging setting, chosen for its open, simple, and real-time nature. To identify credit-related evidence hidden in social data, we conduct an analytical and empirical study on a large-scale dataset from Weibo, the largest and most popular tweet-style website in China. Summarizing results from the existing credit scoring literature, we first propose three social-data-based credit scoring principles as guidelines for in-depth exploration. In addition, we glean six credit-related insights from empirical observations of the testbed dataset. Based on the proposed principles and insights, we extract prediction features mainly from three categories of users' social data: demographics, tweets, and networks. To harness this broad range of features, we put forward a two-tier, stacking- and boosting-enhanced ensemble learning framework. Quantitative investigation of the extracted features shows that online social media data does have good potential for discriminating good-credit users from bad-credit users.
Furthermore, we perform experiments on the real-world Weibo dataset, which consists of more than 7.3 million tweets and 200,000 users whose credit labels are known through our third-party partner. Experimental results show that (i) our approach achieves an AUC of roughly 0.625 with all the proposed social features as input, and (ii) our learning algorithm can outperform traditional credit scoring methods by as much as 17% for social-data-based personal credit scoring.
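
For readers unfamiliar with the AUC metric reported above, the following is a minimal sketch of how it can be computed, assuming the pairwise-ranking definition: the fraction of (good-credit, bad-credit) user pairs in which the good-credit user receives the higher score, with ties counted as half. The function name `auc`, the label encoding (1 = good credit, 0 = bad credit), and the toy data are illustrative assumptions and do not come from the article or its Weibo dataset. Under this definition, 0.5 corresponds to random ranking and 1.0 to perfect ranking.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed by pairwise comparison.

    labels: 1 for a good-credit user, 0 for a bad-credit user (assumed encoding).
    scores: model output, where higher means "more likely good credit".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one user of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # good-credit user ranked above bad-credit user
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only (not from the Weibo dataset).
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.5, 0.2]
toy_auc = auc(toy_labels, toy_scores)
print(toy_auc)
```

The quadratic loop keeps the definition visible; on a dataset with hundreds of thousands of users, a rank-based computation or a library routine would be the practical choice.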
