Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (6): 93-102    DOI: 10.11925/infotech.2096-3467.2020.1273
A Capsule Network Model for Text Classification with Multi-level Feature Extraction
Yu Bengong1,2, Zhu Xiaojie1, Zhang Ziwei1
1School of Management, Hefei University of Technology, Hefei 230009, China
2Key Laboratory of Process Optimization & Intelligent Decision-making, Ministry of Education, Hefei University of Technology, Hefei 230009, China

[Objective] This paper proposes a structured method that extracts text information hierarchically, from bottom to top, to improve the performance of existing shallow text classification models. [Methods] We built the MFE-CapsNet model, which classifies text using both global and high-level features. The model extracts context information with a bidirectional gated recurrent unit (BiGRU) and applies attention to encode the hidden-layer vectors, which strengthens the feature extraction of the sequence model. A capsule network with dynamic routing then aggregates local information into high-level features. We compared the new model with existing methods in a series of experiments. [Results] The F1 values of the MFE-CapsNet model were 96.21%, 94.17%, and 94.19% on Chinese datasets from three different fields. These results were at least 1.28, 1.49, and 0.46 percentage points higher, respectively, than those of widely used text classification methods. [Limitations] We conducted experiments on only three corpora. [Conclusions] The proposed MFE-CapsNet model effectively extracts semantic features and improves the performance of text classification.
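
The capsule layer with dynamic routing mentioned in the abstract can be illustrated with a minimal sketch. It assumes the widely used routing-by-agreement formulation and the standard squash non-linearity; the paper's own variant may differ. The function names and the nested-list layout of `u_hat` are hypothetical, and the default of 5 iterations is taken from the parameter table.

```python
import math


def squash(s):
    """Standard squash: keep the direction of s, map its norm into [0, 1)."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iterations=5):
    """Route lower-level capsules to upper-level capsules by agreement.

    u_hat[i][j] is the prediction vector that lower capsule i makes
    for upper capsule j (hypothetical layout chosen for this sketch).
    Returns one output vector per upper capsule.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    v = []
    for _ in range(num_iterations):
        # Coupling coefficients: each lower capsule distributes its vote.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            # Weighted sum of predictions for upper capsule j, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v
```

Predictions that agree reinforce each other across iterations, while conflicting predictions cancel, so the norm of each output vector can be read as the confidence that the corresponding high-level feature is present.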

Key words: Text Classification; BiGRU; Attention; Capsule Network
Received: 21 December 2020      Published: 06 July 2021
CLC number (ZTFLH): TP391.1
Fund: National Natural Science Foundation of China (71671057); Open Project of the Key Laboratory of Process Optimization & Intelligent Decision-making, Ministry of Education
Corresponding Author: Yu Bengong

Cite this article:

Yu Bengong, Zhu Xiaojie, Zhang Ziwei. A Capsule Network Model for Text Classification with Multi-level Feature Extraction. Data Analysis and Knowledge Discovery, 2021, 5(6): 93-102.


Figure: Model Framework Diagram
Figure: Global Feature Acquisition Model
Figure: Flow Chart of the Dynamic Routing Algorithm
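Table: Statistics of Experimental Data
Dataset              Label                      Sample size
Car reviews          Positive                   14,513
                     Negative                   14,482
Telecom complaints   Business rules             4,171
                     Operations management      4,304
                     Publicity and promotion    4,977
                     Communication problems     9,243
Headline news        Culture                    1,060
                     Entertainment              1,568
                     Sports                     1,540
                     Finance                    1,093
                     Real estate                700
                     Automobile                 1,433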
Table: Experimental Parameter Settings
Parameter                   Value
Word embedding dimension    300
GRU hidden units            128
Number of capsules          10
Capsule dimension           16
Routing iterations          5
Optimizer                   Adam
Batch size                  64
Epochs                      20
Dropout                     0.25
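
For anyone reproducing the setup, the parameter settings above can be collected in a single configuration object. This is only a sketch: the class and field names are hypothetical, and the derived `bigru_output_dim` assumes the forward and backward GRU states are concatenated, which the page does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Values copied from the parameter table; field names are illustrative."""
    embedding_dim: int = 300
    gru_hidden_units: int = 128
    num_capsules: int = 10
    capsule_dim: int = 16
    routing_iterations: int = 5
    optimizer: str = "Adam"
    batch_size: int = 64
    epochs: int = 20
    dropout: float = 0.25

    @property
    def bigru_output_dim(self) -> int:
        # Assumption: both directions are concatenated per time step.
        return 2 * self.gru_hidden_units


config = ExperimentConfig()
```

Freezing the dataclass keeps the settings from being changed by accident during a training run.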
Table: Experimental Results (%)
Dataset              Model          P       R       F1
Car reviews          Transformer    90.89   90.88   90.88
                     TextRNN        91.40   91.37   91.28
                     GCN            93.25   93.27   93.25
                     G-Caps         93.81   93.78   93.78
                     TextRCNN       94.96   94.92   94.93
                     MFE-CapsNet    96.24   96.22   96.21
Telecom complaints   Transformer    88.60   88.92   88.61
                     TextRNN        90.91   90.90   90.05
                     GCN            91.76   91.47   91.41
                     G-Caps         92.95   92.29   92.49
                     TextRCNN       93.98   92.53   92.68
                     MFE-CapsNet    94.47   94.02   94.17
Headline news        Transformer    85.99   83.29   83.79
                     TextRNN        89.71   89.03   89.14
                     GCN            92.07   92.02   92.02
                     G-Caps         93.21   92.67   92.83
                     TextRCNN       93.57   93.97   93.73
                     MFE-CapsNet    94.42   94.02   94.19
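
The margins quoted in the abstract (1.28, 1.49, and 0.46 percentage points) can be checked against the F1 column of the results table. The short script below takes the difference between MFE-CapsNet and the strongest baseline on each dataset. The dictionary keys are English renderings of the dataset names (car reviews, telecom complaints, headline news) and the helper name is hypothetical.

```python
# F1 values (%) copied from the results table.
f1_scores = {
    "car_reviews": {
        "Transformer": 90.88, "TextRNN": 91.28, "GCN": 93.25,
        "G-Caps": 93.78, "TextRCNN": 94.93, "MFE-CapsNet": 96.21,
    },
    "telecom_complaints": {
        "Transformer": 88.61, "TextRNN": 90.05, "GCN": 91.41,
        "G-Caps": 92.49, "TextRCNN": 92.68, "MFE-CapsNet": 94.17,
    },
    "headline_news": {
        "Transformer": 83.79, "TextRNN": 89.14, "GCN": 92.02,
        "G-Caps": 92.83, "TextRCNN": 93.73, "MFE-CapsNet": 94.19,
    },
}


def margin_over_best_baseline(scores, ours="MFE-CapsNet"):
    """Return F1 of `ours` minus the best baseline F1, in percentage points."""
    best_baseline = max(v for name, v in scores.items() if name != ours)
    return round(scores[ours] - best_baseline, 2)


margins = {dataset: margin_over_best_baseline(scores)
           for dataset, scores in f1_scores.items()}
```

On all three datasets the strongest baseline is TextRCNN, so the "at least" in the abstract refers to the gap between MFE-CapsNet and TextRCNN.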
Table: Accuracy of BiGRU-CapsNet and MFE-CapsNet (%)
Model            Car reviews   Telecom complaints   Headline news
BiGRU-CapsNet    94.59         94.67                94.39
MFE-CapsNet      96.22         95.05                94.46
Figure: Influence of the Number of Routing Iterations on F1 Value
Table: Influence of the Squash Function on Accuracy (%)
Function    Car reviews   Telecom complaints   Headline news
squash1x    95.57         92.76                92.31
squash2x    96.22         95.05                94.46