Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (10): 65-76    DOI: 10.11925/infotech.2096-3467.2018.0026
Predicting Credit Risks of P2P Loans in China Based on Ensemble Learning Methods
Cao Wei, Li Can, He Tingting, Zhu Weidong
School of Economics, Hefei University of Technology, Hefei 230601, China
Abstract  

[Objective] This paper evaluates several popular ensemble learning methods on real-world data to identify the most suitable approach for monitoring P2P credit risk in China. [Methods] We extracted borrower features from five categories and selected the most important ones with the Random Forest method. We then compared prediction models built from four ensemble learning methods and five base classifiers. [Results] With feature selection, the Rotation Forest method achieved the highest average accuracy (99.32%) and the lowest average Type-I error rate (1.71%). Feature selection based on Random Forest could significantly improve the performance of all related models. [Limitations] The sample dataset needs to be expanded. [Conclusions] The proposed method could identify credit risks more effectively.

Key words: Ensemble Learning; Feature Selection; P2P Net Loan; Credit Risks
Received: 09 January 2018      Published: 12 November 2018
ZTFLH:  F832.4 G35  

Cite this article:

Cao Wei, Li Can, He Tingting, Zhu Weidong. Predicting Credit Risks of P2P Loans in China Based on Ensemble Learning Methods. Data Analysis and Knowledge Discovery, 2018, 2(10): 65-76.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2018.0026     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I10/65

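| Variable type | Variable | Meaning | Encoding |
|---|---|---|---|
| Dependent variable | label | Whether the loan defaulted | default=1, no default=0 |
| Borrower characteristics | F1 | Age | 20-25=0, 26-31=1, 32-37=2, 38-43=3, 44-49=4, 50 and above=5 |
| | F2 | Education | high school or below=0, junior college=1, bachelor's degree=2, postgraduate=3 |
| | F3 | Marital status | single (including unmarried, divorced, and widowed)=0, married=1 |
| | F4 | Years of work | missing=0, 1 year or less=2, 1-3 years (inclusive)=4, 3-5 years (inclusive)=6, more than 5 years=8 |
| | F5 | Work city | eastern=0, central=1, western=2 |
| | F6 | Company industry | industry of the borrower's company* |
| | F7 | Company size | missing=0, fewer than 10 employees=1, 10-100 employees=2, 100-500 employees=3, more than 500 employees=4 |
| Borrower financial information | F8 | Income | below 1000 yuan=0, 1000-2000 yuan=1, 2000-5000 yuan=2, 5000-10000 yuan=3, 10000-20000 yuan=4, 20000-50000 yuan=5, above 50000 yuan=6 |
| | F9 | Credit rating | HR=0, E=1, D=2, C=3, B=4, A=5, AA=6 |
| | F10 | Credit limit | credit limit, Min-Max normalized |
| | F11 | Housing property | owns property=1, no property=0 |
| | F12 | Car | owns a car=1, no car=0 |
| | F13 | Mortgage | no mortgage=1, has mortgage=0 |
| | F14 | Car loan | no car loan=1, has car loan=0 |
| Borrower history | F15 | Successful loans | number of loans the borrower has successfully obtained |
| | F16 | Loan applications | number of loans the borrower has applied for in the past |
| | F17 | Overdue count | number of past overdue repayments |
| | F18 | Serious overdue | serious overdue record exists=1, otherwise=0 |
| Loan characteristics | F19 | Loan amount | requested loan amount, Min-Max normalized |
| | F20 | Purpose | purpose of the loan** |
| | F21 | Interest rate | annual interest rate of the loan |
| | F22 | Repayment term | loan term in months, from 3 (shortest) to 36 (longest) |
| | F23 | Listing type | institution-guaranteed=0, credit-certified=1, field-certified=2 |
| Platform verification information | F24 | Credit verification | borrower provides a personal credit report issued by the central bank; verified=1, otherwise=0 |
| | F25 | Identity verification | borrower provides a copy of the ID card; verified=1, otherwise=0 |
| | F26 | Work verification | borrower provides a copy of the work ID or labor contract; verified=1, otherwise=0 |
| | F27 | Income verification | borrower provides proof of income or the bank statement of the salary card; verified=1, otherwise=0 |
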
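The encodings in the variable table can be illustrated with a minimal sketch. The function names and the sample amounts are hypothetical and do not come from the paper; the code only mirrors the bucket boundaries listed for F1, the ordinal codes listed for F9, and the Min-Max normalization mentioned for F10 and F19.

```python
from typing import List, Sequence

# Ordinal codes for F9 (credit rating), copied from the variable table.
CREDIT_RATING_CODES = {"HR": 0, "E": 1, "D": 2, "C": 3, "B": 4, "A": 5, "AA": 6}


def encode_credit_rating(rating: str) -> int:
    """Map a credit rating label to its ordinal code (F9)."""
    return CREDIT_RATING_CODES[rating]


def encode_age(age: int) -> int:
    """Map an age in years to its F1 bucket: 20-25 -> 0, ..., 50 and above -> 5."""
    if age < 20:
        # The table defines no bucket below 20; rejecting is an assumption.
        raise ValueError("no age bucket is defined below 20")
    # Buckets are six years wide starting at 20; everything from 50 up is bucket 5.
    return min((age - 20) // 6, 5)


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Rescale values linearly to [0, 1], as described for F10 and F19."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column maps to 0.0 (the table does not cover this).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical loan amounts in yuan, for illustration only.
scaled_amounts = min_max_scale([1000.0, 5500.0, 10000.0])
```

Keeping the ordinal codes in one dictionary makes the mapping easy to audit against the table; the age rule works because every bucket below 50 spans exactly six years.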
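
| Variable type | Variable | Meaning |
|---|---|---|
| Borrower characteristics | F1 | Age |
| | F3 | Marital status |
| Borrower financial information | F9 | Credit rating |
| | F10 | Credit limit |
| Borrower history | F17 | Overdue count |
| | F18 | Serious overdue |
| Loan characteristics | F19 | Loan amount |
| | F22 | Repayment term |
| Platform verification information | F26 | Work verification |
| | F27 | Income verification |
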
| | Predicted default | Predicted non-default |
|---|---|---|
| Actual default | TP | FN |
| Actual non-default | FP | TN |

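The confusion matrix above is the basis for rate-style metrics such as accuracy. The sketch below is illustrative only: the class name, method names, and counts are hypothetical. This page does not state which of the two error rates the paper reports as its Type-I error, so both are shown under neutral names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # actual default, predicted default
    fn: int  # actual default, predicted non-default
    fp: int  # actual non-default, predicted default
    tn: int  # actual non-default, predicted non-default

    def accuracy(self) -> float:
        """Share of all loans that were classified correctly."""
        return (self.tp + self.tn) / (self.tp + self.fn + self.fp + self.tn)

    def missed_default_rate(self) -> float:
        """Share of actual defaults predicted as non-default: FN / (TP + FN)."""
        return self.fn / (self.tp + self.fn)

    def false_alarm_rate(self) -> float:
        """Share of actual non-defaults predicted as default: FP / (FP + TN)."""
        return self.fp / (self.fp + self.tn)


# Hypothetical counts, not taken from the paper.
cm = ConfusionMatrix(tp=90, fn=10, fp=20, tn=880)
print(f"accuracy={cm.accuracy():.4f}")
print(f"missed defaults={cm.missed_default_rate():.4f}")
print(f"false alarms={cm.false_alarm_rate():.4f}")
```

The methods assume that each class has at least one sample; an empty class would raise `ZeroDivisionError`.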
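
Training/test ratio 60:40 (identified by matching the per-method averages in the summary tables)

| Ensemble method | Base classifier | Accuracy (%), A | Accuracy (%), FS | Type-I error (%), A | Type-I error (%), FS | AUC, A | AUC, FS |
|---|---|---|---|---|---|---|---|
| Bagging | LR | 97.17 | 98.79 | 7.47 | 6.71 | 0.987 | 0.989 |
| | CART | 98.08 | 98.58 | 6.32 | 2.87 | 0.936 | 0.999 |
| | C4.5 | 97.67 | 98.28 | 6.90 | 2.87 | 0.949 | 0.998 |
| | MLP | 96.06 | 98.08 | 10.92 | 8.05 | 0.988 | 0.998 |
| | SVM | 97.47 | 97.97 | 9.20 | 6.32 | 0.973 | 0.995 |
| Boosting | LR | 97.98 | 98.48 | 5.74 | 3.45 | 0.984 | 0.898 |
| | CART | 97.17 | 98.88 | 6.32 | 5.17 | 0.983 | 0.994 |
| | C4.5 | 97.17 | 98.07 | 5.17 | 4.02 | 0.995 | 0.996 |
| | MLP | 97.27 | 98.88 | 8.05 | 2.30 | 0.995 | 0.999 |
| | SVM | 97.97 | 98.28 | 8.62 | 4.59 | 0.954 | 0.971 |
| Random Subspace | LR | 95.55 | 97.27 | 15.52 | 8.05 | 0.984 | 0.996 |
| | CART | 96.46 | 97.17 | 4.59 | 2.87 | 0.940 | 0.981 |
| | C4.5 | 95.05 | 98.07 | 6.32 | 4.02 | 0.961 | 0.994 |
| | MLP | 95.45 | 97.67 | 8.62 | 7.47 | 0.965 | 0.997 |
| | SVM | 96.87 | 97.37 | 9.77 | 7.47 | 0.955 | 0.963 |
| Rotation Forest | LR | 98.68 | 99.69 | 3.45 | 0.57 | 0.998 | 1.000 |
| | CART | 98.48 | 99.19 | 3.45 | 1.15 | 0.992 | 0.998 |
| | C4.5 | 97.97 | 98.99 | 6.90 | 5.17 | 0.954 | 0.996 |
| | MLP | 98.07 | 99.29 | 6.32 | 1.15 | 0.997 | 1.000 |
| | SVM | 98.78 | 99.79 | 4.59 | 0.00 | 0.975 | 0.999 |
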
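Training/test ratio 70:30 (identified by matching the per-method averages in the summary tables)

| Ensemble method | Base classifier | Accuracy (%), A | Accuracy (%), FS | Type-I error (%), A | Type-I error (%), FS | AUC, A | AUC, FS |
|---|---|---|---|---|---|---|---|
| Bagging | LR | 98.65 | 99.46 | 6.03 | 2.59 | 0.990 | 0.998 |
| | CART | 97.71 | 98.65 | 6.03 | 5.17 | 0.977 | 0.996 |
| | C4.5 | 97.71 | 98.65 | 12.93 | 5.17 | 0.992 | 0.998 |
| | MLP | 96.36 | 98.38 | 10.34 | 5.17 | 0.992 | 0.998 |
| | SVM | 98.52 | 98.92 | 5.17 | 1.72 | 0.974 | 0.994 |
| Boosting | LR | 97.98 | 98.48 | 5.74 | 3.45 | 0.984 | 0.898 |
| | CART | 97.71 | 99.59 | 5.17 | 1.72 | 0.984 | 0.999 |
| | C4.5 | 97.30 | 98.38 | 6.03 | 3.45 | 0.981 | 0.997 |
| | MLP | 97.30 | 99.32 | 7.76 | 3.44 | 0.996 | 0.999 |
| | SVM | 98.52 | 98.65 | 4.31 | 2.59 | 0.974 | 0.981 |
| Random Subspace | LR | 97.71 | 98.79 | 7.41 | 4.31 | 0.988 | 0.994 |
| | CART | 96.23 | 97.04 | 12.93 | 5.17 | 0.975 | 0.976 |
| | C4.5 | 96.36 | 97.57 | 12.07 | 3.45 | 0.957 | 0.997 |
| | MLP | 96.77 | 97.57 | 12.93 | 4.31 | 0.993 | 0.995 |
| | SVM | 97.98 | 98.25 | 7.76 | 4.31 | 0.956 | 0.994 |
| Rotation Forest | LR | 98.65 | 99.59 | 4.31 | 0.86 | 0.998 | 1.000 |
| | CART | 98.11 | 98.65 | 3.44 | 1.72 | 0.993 | 0.995 |
| | C4.5 | 98.65 | 99.05 | 6.03 | 3.45 | 0.985 | 0.999 |
| | MLP | 97.84 | 99.46 | 7.76 | 2.59 | 0.995 | 0.999 |
| | SVM | 98.92 | 99.73 | 4.31 | 1.72 | 0.976 | 1.000 |
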
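Training/test ratio 80:20 (identified by matching the per-method averages in the summary tables)

| Ensemble method | Base classifier | Accuracy (%), A | Accuracy (%), FS | Type-I error (%), A | Type-I error (%), FS | AUC, A | AUC, FS |
|---|---|---|---|---|---|---|---|
| Bagging | LR | 97.17 | 98.99 | 7.32 | 4.90 | 0.962 | 0.990 |
| | CART | 97.17 | 98.78 | 8.53 | 6.09 | 0.968 | 0.994 |
| | C4.5 | 97.57 | 98.58 | 9.76 | 4.90 | 0.995 | 0.997 |
| | MLP | 96.56 | 98.79 | 8.53 | 6.09 | 0.989 | 0.985 |
| | SVM | 96.37 | 98.38 | 9.76 | 3.66 | 0.976 | 0.986 |
| Boosting | LR | 96.56 | 97.57 | 13.41 | 6.09 | 0.966 | 0.995 |
| | CART | 97.37 | 97.77 | 8.53 | 3.66 | 0.995 | 0.997 |
| | C4.5 | 95.14 | 97.36 | 8.53 | 4.90 | 0.972 | 0.980 |
| | MLP | 97.37 | 98.58 | 9.76 | 6.09 | 0.993 | 0.996 |
| | SVM | 97.37 | 98.18 | 7.32 | 3.66 | 0.982 | 0.990 |
| Random Subspace | LR | 94.13 | 97.36 | 20.73 | 8.53 | 0.966 | 0.994 |
| | CART | 96.56 | 96.96 | 14.63 | 9.76 | 0.940 | 0.997 |
| | C4.5 | 96.56 | 97.16 | 12.20 | 10.98 | 0.964 | 0.975 |
| | MLP | 95.95 | 97.97 | 8.53 | 7.32 | 0.968 | 0.991 |
| | SVM | 96.56 | 97.77 | 14.63 | 9.76 | 0.967 | 0.983 |
| Rotation Forest | LR | 98.38 | 99.39 | 6.09 | 1.20 | 0.994 | 1.000 |
| | CART | 97.57 | 99.19 | 6.09 | 1.20 | 0.992 | 0.996 |
| | C4.5 | 97.57 | 99.19 | 7.32 | 1.20 | 0.981 | 0.998 |
| | MLP | 98.38 | 99.19 | 7.32 | 2.44 | 0.992 | 1.000 |
| | SVM | 98.58 | 99.39 | 6.09 | 1.20 | 0.981 | 1.000 |
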
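Average accuracy (%) of each ensemble method

| Training/test ratio | Bagging (A) | Bagging (FS) | Boosting (A) | Boosting (FS) | Random Subspace (A) | Random Subspace (FS) | Rotation Forest (A) | Rotation Forest (FS) |
|---|---|---|---|---|---|---|---|---|
| 60:40 | 97.29 | 98.34 | 97.51 | 98.52 | 95.88 | 97.51 | 98.40 | 99.39 |
| 70:30 | 97.79 | 98.81 | 97.71 | 98.92 | 97.01 | 97.84 | 98.43 | 99.30 |
| 80:20 | 96.97 | 98.70 | 96.76 | 97.89 | 95.95 | 97.44 | 98.10 | 99.27 |
| Average | 97.35 | 98.62 | 97.32 | 98.44 | 96.27 | 97.60 | 98.31 | 99.32 |
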
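Average Type-I error (%) of each ensemble method

| Training/test ratio | Bagging (A) | Bagging (FS) | Boosting (A) | Boosting (FS) | Random Subspace (A) | Random Subspace (FS) | Rotation Forest (A) | Rotation Forest (FS) |
|---|---|---|---|---|---|---|---|---|
| 60:40 | 8.16 | 5.36 | 6.78 | 3.91 | 8.96 | 5.98 | 4.94 | 1.61 |
| 70:30 | 8.10 | 4.00 | 7.24 | 3.27 | 10.62 | 4.31 | 5.17 | 2.07 |
| 80:20 | 8.78 | 5.13 | 9.51 | 4.88 | 14.14 | 9.27 | 6.58 | 1.45 |
| Average | 8.35 | 4.82 | 7.84 | 4.02 | 11.24 | 6.52 | 5.56 | 1.71 |
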
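The summary figures appear to be plain arithmetic means. As a check rather than a statement of the authors' procedure, the sketch below averages the five base-classifier results for Bagging and Rotation Forest from the first per-classifier table, then averages the Rotation Forest (FS) entries over the three training/test ratios. The variable and function names are hypothetical; all numbers are copied from the tables on this page.

```python
from statistics import mean
from typing import Sequence


def average(values: Sequence[float]) -> float:
    """Arithmetic mean rounded to two decimals, matching the tables' precision."""
    return round(mean(values), 2)


# First per-classifier table, rows LR, CART, C4.5, MLP, SVM.
bagging_accuracy_a = [97.17, 98.08, 97.67, 96.06, 97.47]
bagging_type1_a = [7.47, 6.32, 6.90, 10.92, 9.20]
rotation_forest_accuracy_fs = [99.69, 99.19, 98.99, 99.29, 99.79]

# Rotation Forest (FS) entries of the summary tables at 60:40, 70:30, 80:20.
rotation_forest_accuracy_by_ratio = [99.39, 99.30, 99.27]
rotation_forest_type1_by_ratio = [1.61, 2.07, 1.45]

print(average(bagging_accuracy_a))                 # 97.29, the 60:40 Bagging (A) accuracy
print(average(bagging_type1_a))                    # 8.16, the 60:40 Bagging (A) Type-I error
print(average(rotation_forest_accuracy_fs))        # 99.39, the 60:40 Rotation Forest (FS) accuracy
print(average(rotation_forest_accuracy_by_ratio))  # 99.32, the accuracy quoted in the abstract
print(average(rotation_forest_type1_by_ratio))     # 1.71, the error rate quoted in the abstract
```

Under this reading, the 99.32% and 1.71% in the abstract are averages over base classifiers and training/test ratios, not the result of a single model.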
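
| Criterion | Bagging (A) | Bagging (FS) | Boosting (A) | Boosting (FS) | Random Subspace (A) | Random Subspace (FS) | Rotation Forest (A) | Rotation Forest (FS) |
|---|---|---|---|---|---|---|---|---|
| Accuracy | 6.00 | 2.67 | 6.33 | 2.50 | 8.00 | 5.67 | 3.83 | 1.00 |
| Type-I error | 6.33 | 3.33 | 6.33 | 2.00 | 8.00 | 5.00 | 4.00 | 1.00 |
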