Classifying Short Text Complaints with nBD-SVM Model
Bengong Yu 1,2, Yangnan Chen 1, Ying Yang 1,2
1 School of Management, Hefei University of Technology, Hefei 230009, China
2 Key Laboratory of Process Optimization & Intelligent Decision-making, Ministry of Education, Hefei University of Technology, Hefei 230009, China
Abstract [Objective] This paper seeks an effective way to classify unstructured, short-text business complaints, with the aim of helping companies resolve problems more efficiently. [Methods] We first combined a topic model with a distributed representation technique to construct the input vector space for an SVM. We then integrated an ensemble learning method to build the nBD-SVM text classification model. [Results] We tested the proposed model on business complaint texts; its precision reached 81.83%, considerably higher than that of traditional methods. [Limitations] We evaluated the model only with complaints from one company. [Conclusions] The proposed nBD-SVM model can process short-text business complaints effectively.
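The pipeline summarized under [Methods] can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the topic model, the embedding technique, the SVM kernel, or the ensemble scheme, so the code below assumes (a) each complaint already has a topic distribution and a document embedding, (b) the two are simply concatenated, (c) the base learner is a hand-written linear SVM without an intercept, trained by subgradient descent, and (d) the ensemble is bagging with majority voting. All feature values, labels, and function names are hypothetical.

```python
import random

def build_input_vector(topic_dist, doc_embedding):
    """Concatenate topic-model and embedding features into one SVM input vector."""
    return list(topic_dist) + list(doc_embedding)

def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """Linear SVM (hinge loss + L2 penalty, no intercept); labels must be +1 or -1."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # Shrink the weights (L2 penalty), then step on the hinge loss if violated.
            w = [(1.0 - lr * lam) * wj for wj in w]
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
    return w

def predict_one(w, x):
    """Sign of the decision function for a single model."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1

def train_bagged_svm(X, y, n_models=5, seed=0):
    """Train each base SVM on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        models.append(train_linear_svm([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_bagged(models, x):
    """Majority vote over the base models (n_models is odd, so no ties)."""
    votes = sum(predict_one(w, x) for w in models)
    return 1 if votes > 0 else -1

# Hypothetical toy data: (topic distribution, document embedding) per complaint.
# Label +1 and -1 stand for two made-up complaint categories.
raw = [
    ([0.8, 0.2], [1.0, 0.5], 1),
    ([0.9, 0.1], [0.8, 0.6], 1),
    ([0.7, 0.3], [1.1, 0.4], 1),
    ([0.2, 0.8], [-1.0, -0.5], -1),
    ([0.1, 0.9], [-0.9, -0.6], -1),
    ([0.3, 0.7], [-1.2, -0.4], -1),
]
X = [build_input_vector(t, e) for t, e, _ in raw]
y = [label for _, _, label in raw]

models = train_bagged_svm(X, y)
new_complaint = build_input_vector([0.85, 0.15], [0.9, 0.5])
print(predict_bagged(models, new_complaint))
```

In practice the topic distributions and embeddings would come from trained models, and a library SVM with a multi-class strategy would replace the toy learner; the sketch only shows how the two feature sources and the voting step fit together.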
Received: 15 July 2018
Published: 03 July 2019