MBA Thesis: Research on Data-Driven Default Risk Prediction Methods for Consumer Finance (DOC)

Updated: 2021/5/31 (published from Guangdong)

Text description
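Consumption is the final demand, and promoting it is important for unlocking domestic demand, driving economic transformation and upgrading, and safeguarding and improving people's livelihoods. Accordingly, commercial banks, consumer finance companies, and Internet finance enterprises, while continuing their traditional personal finance business, are actively expanding into diversified consumer finance products such as credit cards, consumer credit, and P2P lending, helping the consumer market keep growing and the consumption structure keep improving.

In recent years, as the "Internet+" strategy has advanced, financial data has grown explosively, and credit data has become complex, diverse, and heterogeneous. Traditional financial data analysis methods mostly follow a model-driven strategy and cannot cope effectively with personal default risk prediction. As a result, credit defaults occur frequently, and financial institutions of every kind bear default risk. It is therefore urgent to introduce recent machine learning algorithms to improve the early-warning mechanism for personal default risk and to promote the healthy, sustainable development of the consumer finance market. Doing so has both theoretical significance and practical value for enriching and improving the consumer finance credit risk management system.

Building on a review of existing theory and methods for consumer finance and default risk, this thesis identifies three problems in consumer credit data: imbalanced samples, small data, and high-dimensional features. It systematically studies data-driven default risk prediction for consumer finance across multiple scenarios and, drawing on deep learning algorithms, constructs prediction methods based on heterogeneous ensemble learning, feature transfer learning, and ensemble deep learning. Comparative experiments verify the accuracy of the proposed methods and ultimately resolve the problems that credit data present. The specific research content and contributions are as follows:

(1) Credit card default risk prediction based on heterogeneous ensemble learning. The thesis analyzes the significant effect of imbalanced samples in credit card consumption data on personal default risk prediction and proposes a progressive heterogeneous ensemble learning framework that can overcome the imbalance problem. It builds individual classifiers for credit card default risk prediction based on XGBoost, neural networks, and logistic regression, and studies missing-value handling strategies based on ranking features and discrete features. On this basis, it constructs a credit card default risk prediction method for imbalanced samples. Comparative experiments on credit card consumption data comprising 12,000 samples and 122 features show that the method achieves better prediction accuracy than the comparison methods and handles the imbalanced-sample problem well.

(2) Consumer credit default risk prediction based on feature transfer learning. The thesis examines how the cold start for newly acquired consumer credit customers leads to underfitting in default risk prediction, and proposes a feature transfer learning framework that can solve the small-data problem. It designs a similarity estimation algorithm over features and samples and transfers the subset of credit card data that resembles the consumer credit business. It builds individual classifiers for consumer credit default risk prediction based on GBDT, XGBoost, and LightGBM. On this basis, it proposes a consumer credit default risk prediction method for small data. Comparative experiments on consumer credit data made up of 40,000 credit card samples and 4,000 consumer credit records show that the method attains higher AUC and sensitivity scores than the baseline methods and handles the small-data problem well.

(3) P2P lending default risk prediction based on ensemble deep learning. The thesis analyzes how the high-dimensional features of P2P lending credit data cause the curse of dimensionality in default risk prediction, and proposes an ensemble deep learning framework that can cope with high-dimensional features. It builds a P2P lending default risk classifier based on deep neural networks and uses a random search strategy to optimize hyperparameters, thereby designing and configuring the network's internal structure. It also studies the imbalanced-sample phenomenon in credit data and proposes a Bagging ensemble strategy for deep neural network models. On this basis, it constructs a P2P lending default risk prediction method for high-dimensional features. Comparative experiments on P2P lending credit data comprising 15,000 samples and 1,138 features show that, relative to the comparison models, the method correctly distinguishes defaulting customers and handles the high-dimensional feature problem well.

In summary, the overall risk level in China's consumer finance sector is currently controllable. However, as an emerging form of finance, consumer finance has a short operating history, its risk-control modeling capability is limited, and its ability to contain bad debt has yet to be tested over time. Credit and fraud risks such as multi-platform borrowing and malicious loan fraud remain persistent challenges, and risk control will remain a constant theme for consumer finance companies. This thesis therefore takes data-based risk control as the foundation of the consumer finance risk-control system and incorporates the idea of "data + algorithm + risk-control model", which allows the risk-control system to be measured quantitatively and supports intelligent finance that reduces manual intervention, lowers risk, and cuts losses. This has theoretical significance and practical value for enriching and developing, from a management perspective, the body of methods for consumer finance default risk prediction and for raising the level of credit risk management in the sector.

New-generation artificial intelligence is becoming a strategic technology leading the fintech revolution and industrial transformation. New risk early-warning mechanisms are needed that support cross-domain integration, human-machine collaboration, and open crowd intelligence, in order to further drive product innovation in consumer finance services such as credit cards, consumer credit, and P2P lending. At the same time, as Internet applications deepen and artificial intelligence advances, multiple data types such as text, images, audio and video, and social relationships will soon become important inputs for building customer profiles, placing new and higher demands on the multimodal, cross-media perception, fusion, and reasoning capabilities of consumer finance default risk prediction models.

Keywords: default risk prediction; consumer finance; imbalanced samples; ensemble learning; small data; transfer learning; high-dimensional features; deep learning

Abstract

Consumption is the ultimate demand, and promoting consumption is of great significance for unleashing the potential of domestic demand, promoting economic transformation and upgrading, and ensuring and improving people's livelihood. Accordingly, while carrying out traditional personal finance business, commercial banks, consumer finance companies, and Internet finance enterprises are actively expanding diversified consumer finance businesses such as credit cards, consumer credit, and P2P lending, helping to promote the continuous expansion of the consumer market and the continuous optimization of the consumption structure.

In recent years, with the further development of the "Internet+" strategy, financial data has grown explosively, and credit data now exhibits complexity, diversity, and heterogeneity. Traditional financial data analysis methods mostly adopt a model-driven strategy and cannot deal effectively with personal default risk prediction, so credit defaults happen frequently and every kind of financial institution bears default risk. In view of this, it is urgent to improve the early-warning mechanism for personal default risk and to promote the healthy and sustainable development of the consumer finance market by introducing the latest machine learning algorithms, which has important theoretical significance and practical value for enriching and improving the consumer finance credit risk management system.

On the basis of summarizing the existing theoretical methods of consumer finance and default risk, this paper identifies the problems of imbalanced samples, small data, and high-dimensional features in consumer credit data.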
In addition, this paper systematically studies data-driven default risk prediction methods for consumer finance under multiple scenarios, making full use of deep learning algorithms, and constructs default risk prediction methods based on heterogeneous ensemble learning, feature transfer learning, and ensemble deep learning. The accuracy of the proposed methods is verified through comparative experiments, and the problems presented by credit data are finally solved. The specific research content and innovations of this paper are as follows:

(1) Credit card default risk prediction based on heterogeneous ensemble learning. This paper analyzes the significant influence of imbalanced samples in credit card consumption data on the prediction of personal default risk, and proposes a progressive heterogeneous ensemble learning framework that can overcome the problem of imbalanced samples. In addition, individual classifiers for credit card default risk prediction are constructed based on XGBoost, neural network, and logistic regression algorithms. On this basis, a credit card default risk prediction method for imbalanced samples is constructed. A comparative experimental study was conducted using credit card consumption data comprising 12,000 samples and 122 feature dimensions. The results show that the proposed method has better prediction accuracy than the comparison methods and handles the problem of imbalanced samples well.

(2) Consumer credit default risk prediction based on feature transfer learning.
By analyzing the under-fitting that the cold start causes in default risk prediction for new consumer credit customers, we propose a feature transfer learning framework that can solve the problem of small data, design a similarity estimation algorithm for features and samples, and transfer the credit card data that is similar to the consumer credit business. We also construct individual classifiers for consumer credit default risk prediction based on the GBDT, XGBoost, and LightGBM algorithms. On this basis, a consumer credit default risk prediction method for small data is proposed.

(3) Research on default risk prediction of P2P lending based on ensemble deep learning
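
According to the abstract, the approach in item (3) combines neural network classifiers with a Bagging ensemble strategy to cope with imbalanced samples. The following is a minimal sketch of that general idea, not the thesis's implementation. Everything specific in it is an assumption made for illustration: it uses NumPy, a small one-hidden-layer network stands in for the deep neural network, balanced bootstrap resampling is one possible way to make Bagging imbalance-aware, the hyperparameters are fixed rather than found by random search, and the data is synthetic.

```python
import numpy as np

def train_mlp(X, y, hidden=8, lr=0.1, epochs=500, seed=0):
    """Train a one-hidden-layer network with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                # hidden activations
        p = 1 / (1 + np.exp(-(H @ W2 + b2)))    # predicted default probability
        g = (p - y) / len(y)                    # gradient of mean log-loss w.r.t. logits
        gH = np.outer(g, W2) * (1 - H ** 2)     # backpropagate through tanh
        W2 -= lr * (H.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (X.T @ gH)
        b1 -= lr * gH.sum(axis=0)
    return W1, b1, W2, b2

def predict_proba(model, X):
    """Return the predicted probability of the positive (default) class."""
    W1, b1, W2, b2 = model
    return 1 / (1 + np.exp(-(np.tanh(X @ W1 + b1) @ W2 + b2)))

def balanced_bootstrap(y, rng):
    """Draw, with replacement, equally many rows from each class."""
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    n = len(pos)  # size each class sample to the minority class
    return np.concatenate([rng.choice(pos, n), rng.choice(neg, n)])

def bagged_fit(X, y, n_models=5, seed=0):
    """Fit one network per balanced bootstrap sample."""
    rng = np.random.default_rng(seed)
    models = []
    for k in range(n_models):
        idx = balanced_bootstrap(y, rng)
        models.append(train_mlp(X[idx], y[idx], seed=seed + k))
    return models

def bagged_predict_proba(models, X):
    """Average the member probabilities (soft voting)."""
    return np.mean([predict_proba(m, X) for m in models], axis=0)

# Synthetic, hypothetical data: 900 non-default rows and 100 default rows.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (900, 5)), rng.normal(1.5, 1.0, (100, 5))])
y = np.concatenate([np.zeros(900), np.ones(100)])

models = bagged_fit(X, y)
proba = bagged_predict_proba(models, X)
print("mean score, default rows:    ", proba[y == 1].mean())
print("mean score, non-default rows:", proba[y == 0].mean())
```

Each member sees a class-balanced resample, so no single network is dominated by the majority class, and averaging the members reduces the variance that resampling introduces. A real study would score held-out data with metrics such as AUC and sensitivity, which the abstract reports for item (2), rather than the training rows used here.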