A Capsule Network Model for Text Classification with Multi-level Feature Extraction
Yu Bengong 1,2, Zhu Xiaojie 1, Zhang Ziwei 1
1 School of Management, Hefei University of Technology, Hefei 230009, China
2 Key Laboratory of Process Optimization & Intelligent Decision-making, Ministry of Education, Hefei University of Technology, Hefei 230009, China
[Objective] This paper proposes a structured method that extracts text information hierarchically from bottom to top, aiming to improve the performance of existing shallow text classification models. [Methods] We built the MFE-CapsNet model for text classification based on global and high-level features. The model extracts context information with a bidirectional gated recurrent unit (BiGRU) and introduces an attention mechanism to encode the hidden-layer vectors, improving the feature extraction of the sequence model. A capsule network with dynamic routing then aggregates local features into high-level representations. We also conducted comparative experiments to evaluate the new model. [Results] The F1 scores of MFE-CapsNet were 96.21%, 94.17%, and 94.19% on Chinese datasets from three different fields, exceeding popular text classification methods by at least 1.28, 1.49, and 0.46 percentage points, respectively. [Limitations] We only conducted experiments on three corpora. [Conclusions] The proposed MFE-CapsNet model can effectively extract semantic features and improve the performance of text classification.
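The dynamic-routing step named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the capsule counts, dimensions, and three routing iterations are illustrative assumptions following the standard routing-by-agreement procedure (Sabour et al.), in which coupling coefficients are iteratively refined by the agreement between prediction vectors and output capsules.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Non-linearity that shrinks a vector's norm into [0, 1) while keeping its direction.
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, n_iters=3):
    # u_hat: prediction vectors from low-level capsules, shape (n_in, n_out, d_out).
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                               # routing logits
    for _ in range(n_iters):
        # Coupling coefficients: softmax over the output capsules for each input capsule.
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        s = (c[..., None] * u_hat).sum(axis=0)                # weighted sum -> (n_out, d_out)
        v = squash(s)                                         # output capsule vectors
        b = b + np.einsum('iod,od->io', u_hat, v)             # update logits by agreement
    return v

# Hypothetical sizes: 6 low-level capsules routed to 3 high-level capsules of dimension 4.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 3, 4))
v = dynamic_routing(u_hat)
print(v.shape)  # (3, 4)
```

Because of the squash non-linearity, each output capsule's norm lies strictly below 1 and can be read as the probability that the corresponding class-level feature is present.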
Yu Bengong, Zhu Xiaojie, Zhang Ziwei. A Capsule Network Model for Text Classification with Multi-level Feature Extraction. Data Analysis and Knowledge Discovery, 2021, 5(6): 93-102.