Classification of Short Texts Based on nLD-SVM-RF Model
Bengong Yu1,2, Yumeng Cao1, Yangnan Chen1, Ying Yang1,2
1School of Management, Hefei University of Technology, Hefei 230009, China
2Key Laboratory of Process Optimization & Intelligent Decision-making, Ministry of Education, Hefei University of Technology, Hefei 230009, China
[Objective] This paper addresses the data sparseness of short texts in order to improve the performance of short text classification. [Methods] We propose a multi-channel text model as the input to a short text classifier by integrating semantic, word order, and topic features. We then build a classification method, named nLD-SVM-RF, on the SVM and random forest (RF) algorithms. Finally, we evaluate the new model on short complaint texts. [Results] We compared the new model with single SVM and RF classifiers that use Doc2Vec features. With n = 5, the accuracy of the nLD-SVM-RF method was higher by 9.70% and 6.25%, respectively. [Limitations] The experimental data set needs to be expanded. [Conclusions] The nLD-SVM-RF model provides a practical solution for the business community to analyse short texts and improve decision-making.

Key words: Short Text Classification; Multi-Channel Modelling; SVM; Random Forest; Ensemble Learning; nLD-SVM-RF
Received: 03 July 2019      Published: 14 March 2020
Bengong Yu, Yumeng Cao, Yangnan Chen, Ying Yang. Classification of Short Texts Based on nLD-SVM-RF Model. Data Analysis and Knowledge Discovery, 2020, 4(1): 111-120.

Framework of the nLD-SVM-RF Model
Multi-channel Feature Fusion of Short Text
Doc2Vec Model
Text Modeling Comparison of LDA, Doc2Vec, LD Multi-channel Short Text Features
Comparison of LD-KNN, LD-DecisionTree, LD-SVM, LD-RF, LD-SVM-RF Classification Effects
Comparison of Classification Effects of n = 1, 3, 5, 7, 9
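
The abstract describes two ideas that a small example may make more concrete: fusing several feature channels into one input, and combining several classifiers into one decision. The sketch below is illustrative only and is not the authors' implementation. It assumes that the channels are fused by plain concatenation and that the classifiers are combined by majority vote; the abstract confirms neither detail. The functions `fuse_channels`, `majority_vote`, and `classify`, the three toy classifiers, the labels, and all numeric values are hypothetical stand-ins for trained LDA, Doc2Vec, SVM, and RF models.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence

Vector = List[float]
Classifier = Callable[[Vector], Hashable]


def fuse_channels(topic_vec: Sequence[float], doc_vec: Sequence[float]) -> Vector:
    """Fuse a topic-channel vector and an embedding-channel vector.

    Assumption: fusion is simple concatenation; the paper may use another scheme.
    """
    return list(topic_vec) + list(doc_vec)


def majority_vote(labels: Sequence[Hashable]) -> Hashable:
    """Return the most frequent label; ties go to the label voted for first."""
    if not labels:
        raise ValueError("need at least one vote")
    counts = Counter(labels)
    top = max(counts.values())
    return next(label for label in labels if counts[label] == top)


def classify(topic_vec: Sequence[float], doc_vec: Sequence[float],
             classifiers: Sequence[Classifier]) -> Hashable:
    """Fuse both channels, query every classifier, and vote on the result."""
    fused = fuse_channels(topic_vec, doc_vec)
    return majority_vote([clf(fused) for clf in classifiers])


# Hypothetical stand-ins for trained models (not real SVM or RF classifiers).
def svm_like(x: Vector) -> str:
    return "billing" if x[0] > 0.5 else "network"


def rf_like(x: Vector) -> str:
    return "billing" if x[-1] < 0.0 else "network"


def always_network(x: Vector) -> str:
    return "network"


topic = [0.7, 0.2, 0.1]   # hypothetical topic distribution for one complaint
doc = [0.05, -0.31]       # hypothetical document embedding for the same text

print(fuse_channels(topic, doc))  # [0.7, 0.2, 0.1, 0.05, -0.31]
print(classify(topic, doc, [svm_like, rf_like, always_network]))  # billing
```

An odd number of voters avoids ties in a two-class vote, which is one possible reason for testing n = 1, 3, 5, 7, 9, although the source does not state what n denotes.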
