Details

Risk prediction model of bank telecommunication fraud based on XGBoost (EI indexed)

Document type: Conference paper

English title: Risk prediction model of bank telecommunication fraud based on XGBoost

Authors: Wu, Siyuan[1]; Yang, Derong[1]; Ge, Wenjun[2]; Chen, Baoqin[1]

Affiliations: [1] School of Mathematics and Computer, Guangdong Ocean University, Guangdong, Zhanjiang, 524088, China; [2] School of Economics, Guangdong Ocean University, Guangdong, Zhanjiang, 524088, China

Conference proceedings: International Conference on Cyber Security, Artificial Intelligence, and Digital Economy, CSAIDE 2023

Conference dates: March 3, 2023 - March 5, 2023

Conference location: Nanjing, China

Language: English

Keywords: Classification (of information) - Correlation methods - Data mining - Decision trees - Forecasting - Logistic regression - Machine learning - Statistical tests

Abstract: The digital economy is booming, but cybercrime and telecommunication fraud keep emerging. Detecting fraudulent behaviour and preventing such crimes is a significant challenge. This paper performs data mining and analysis on a bank card telecommunication fraud data set. It first carries out data mining and feature engineering on the given data set: checking data integrity, computing overall descriptive statistics, standardizing the data with the Z-score method, exploring feature correlation with the Pearson correlation coefficient, balancing the data set with the SMOTE method, and finally splitting the data into a training set and a test set. Four machine learning classification models (logistic regression, KNN, decision tree, and XGBoost) are then built to make a preliminary prediction and classification of fraudulent behaviour. To mine the data set further, the optimal configuration of each of the four models is obtained by grid-search tuning and cross-validation. In the experiments, the logistic regression, KNN, decision tree, and XGBoost classification models reach test-set prediction accuracies of 93.45%, 99.85%, 99.92%, and 99.94%, respectively, which preliminarily suggests that the XGBoost and decision tree models have excellent classification capability. The four optimal models are then further evaluated on the test set with three performance indicators: prediction accuracy, recall, and F1 score. Comparative analysis shows that the XGBoost classification model performs best. Owing to its classification ability, strong generalization ability, and robustness, it is selected as the final bank card telecommunication fraud prediction model. In addition, the P-R curve and ROC curve of the classification results are plotted to present the performance evaluation intuitively. Analysis of the model's performance further shows that XGBoost has better generalization ability. © 2023 SPIE.
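The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal sketch. This record does not include the paper's code, so everything below is an assumption made for illustration: the function names (`z_score`, `pearson`, `smote_like`, `evaluate`) are hypothetical, the SMOTE step is simplified to a single nearest neighbour, and only the Python standard library is used instead of whatever toolkit the authors worked with.

```python
import math
import random
from statistics import mean, pstdev


def z_score(column):
    """Z-score standardization: rescale one feature column to mean 0, std 1."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length columns."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def smote_like(minority, n_new, rng):
    """Simplified SMOTE (assumption: k = 1 neighbour).

    Each synthetic row is a random point on the segment between a randomly
    chosen minority row and its nearest minority neighbour.
    """
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [row for row in minority if row is not base]
        neighbour = min(others, key=lambda row: math.dist(row, base))
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def evaluate(y_true, y_pred):
    """Accuracy, recall, and F1 score, treating label 1 (fraud) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "recall": recall, "f1": f1}
```

In a pipeline like the one described, standardization and the train/test split would normally come before oversampling, and oversampling would be applied to the training portion only so that synthetic rows do not leak into the test set. Whether the paper follows that order is not stated in the abstract.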

