Classifying Short Text Complaints with nBD-SVM Model
Bengong Yu1,2, Yangnan Chen1, Ying Yang1,2
1(School of Management, Hefei University of Technology, Hefei 230009, China)
2(Key Laboratory of Process Optimization & Intelligent Decision-making, Ministry of Education, Hefei University of Technology, Hefei 230009, China)
[Objective] This paper seeks an effective way to classify unstructured, short-text business complaints, with the aim of improving the efficiency of corporate problem solving. [Methods] We first combined a topic model with a distributed representation technique to construct the input vector space for an SVM. We then applied an ensemble learning method to build the nBD-SVM text classification model. [Results] We tested the proposed model on business complaint texts; its precision reached 81.83%, well above that of the traditional methods. [Limitations] We evaluated the model only on complaints from one company. [Conclusions] The proposed nBD-SVM model can process short-text business complaints effectively.
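The pipeline outlined under [Methods] can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which topic model, embedding, or ensemble method is used, so the sketch assumes that each complaint already has a topic distribution and a document embedding (hand-made toy values here), that the two are simply concatenated, and that the ensemble is bagging with a majority vote. The base learner is a small linear SVM trained by batch subgradient descent on the hinge loss, restricted to two classes and no bias term for brevity. All function names are hypothetical.

```python
import random
from collections import Counter


def concat_features(topic_vec, doc_vec):
    """Join a topic distribution and a document embedding into one SVM input vector."""
    return list(topic_vec) + list(doc_vec)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, steps=200, lr=0.1, lam=0.01):
    """Batch subgradient descent on the L2-regularised hinge loss (labels +1/-1, no bias)."""
    w = [0.0] * len(X[0])
    n = len(X)
    for _ in range(steps):
        # Start from the gradient of the regulariser, lam * w.
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            if yi * dot(w, xi) < 1:  # this sample violates the margin
                grad = [g - yi * xj / n for g, xj in zip(grad, xi)]
        w = [wj - lr * g for wj, g in zip(w, grad)]
    return w


def train_bagged_svms(X, y, n_estimators=5, seed=0):
    """Assumed ensemble step: train each base SVM on a bootstrap resample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_estimators):
        idx = rng.choices(range(len(X)), k=len(X))
        models.append(train_linear_svm([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict(models, x):
    """Majority vote over the base classifiers."""
    votes = [1 if dot(w, x) >= 0 else -1 for w in models]
    return Counter(votes).most_common(1)[0][0]


# Toy data: (topic distribution, document embedding, label). All values are invented.
samples = [
    ([0.9, 0.1], [1.0, 0.5], 1),
    ([0.8, 0.2], [0.8, 0.6], 1),
    ([0.7, 0.3], [1.2, 0.4], 1),
    ([0.1, 0.9], [-1.0, -0.5], -1),
    ([0.2, 0.8], [-0.9, -0.4], -1),
    ([0.3, 0.7], [-1.1, -0.6], -1),
]
X = [concat_features(t, d) for t, d, _ in samples]
y = [label for _, _, label in samples]

models = train_bagged_svms(X, y)
print(predict(models, concat_features([0.85, 0.15], [0.9, 0.5])))
print(predict(models, concat_features([0.15, 0.85], [-0.9, -0.5])))
```

In practice the two input vectors would come from a trained topic model and a trained embedding model, and a multi-class complaint taxonomy would need a one-vs-rest or similar extension of the binary learner shown here.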
Bengong Yu, Yangnan Chen, Ying Yang. Classifying Short Text Complaints with nBD-SVM Model[J]. Data Analysis and Knowledge Discovery, 2019, 3(5): 77-85.