Fraud Detection by Machine Learning

Abstract

Fraud detection aims to identify unusual events when screening credit card transactions, insurance claims, account applications, and similar records. In this paper, five supervised models were built to identify fraudulent transactions, using real transaction data gathered from a website. The workflow comprised data description, data cleaning, variable creation, feature selection (using filter and wrapper methods), and modelling. Variable creation produced about 220 candidate variables, of which 40 were retained after feature selection. Five machine learning models were then built: logistic regression, support vector machine, random forest, neural network, and boosted tree. The best model was the boosted tree, with a 54.3% FDR at a 3% cutoff on the testing data and a 54% FDR at a 3% cutoff on the OOT data. The significance of this work lies in handling possible risks in credit card transactions in real time. The results indicate that, among the five models, the boosted tree is the most suitable for this type of large, unbalanced data set.
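The abstract reports its results as an FDR at a 3% cutoff but does not define the metric. The sketch below assumes FDR means fraud detection rate: the share of all fraudulent records that fall within the top 3% of records when ranked by model score. The function name `fdr_at_cutoff` and its parameters are hypothetical, chosen for illustration, and are not taken from the paper.

```python
from typing import Sequence


def fdr_at_cutoff(
    scores: Sequence[float],
    labels: Sequence[int],
    cutoff: float = 0.03,
) -> float:
    """Share of all fraud cases captured in the top `cutoff` fraction of records.

    Assumption: labels are 1 for fraud and 0 for non-fraud, and a higher
    score means the model considers the record more likely to be fraud.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    total_fraud = sum(labels)
    if total_fraud == 0:
        raise ValueError("no fraud cases present; the rate is undefined")

    # Number of records examined at this cutoff (at least one).
    n_top = max(1, round(len(scores) * cutoff))

    # Rank records from highest to lowest score and count the fraud caught.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    caught = sum(label for _, label in ranked[:n_top])
    return caught / total_fraud


# Toy data: 100 records, 4 of them fraud, 2 of which score in the top 3.
scores = [i / 100 for i in range(100)]
labels = [0] * 100
for fraud_index in (99, 98, 50, 10):
    labels[fraud_index] = 1

print(fdr_at_cutoff(scores, labels, cutoff=0.03))
```

Under this reading, a 54.3% FDR at a 3% cutoff would mean that reviewing only the 3% highest-scoring transactions captures a little over half of all fraud, which is why the metric suits heavily unbalanced data better than plain accuracy.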
