Data Analysis and Knowledge Discovery, 2021, Vol. 5, Issue (6): 93-102     https://doi.org/10.11925/infotech.2096-3467.2020.1273
A Capsule Network Model for Text Classification with Multi-level Feature Extraction
Yu Bengong1,2(),Zhu Xiaojie1,Zhang Ziwei1
1School of Management, Hefei University of Technology, Hefei 230009, China
2Key Laboratory of Process Optimization & Intelligent Decision-making, Ministry of Education, Hefei University of Technology, Hefei 230009, China

Abstract

[Objective] This paper proposes a method that extracts text information hierarchically, from the bottom up, aiming to improve the feature extraction ability and performance of existing shallow text classification models. [Methods] We built MFE-CapsNet, a text classification model based on global and high-level feature acquisition. The model extracts context information with a bidirectional gated recurrent unit (BiGRU) and introduces weighted attention to encode the forward and backward hidden vectors, improving the quality of the sequence model's feature representation. A capsule network with dynamic routing then aggregates local information into high-level features. We conducted comparative text classification experiments with the resulting model. [Results] The F1 scores of MFE-CapsNet reached 96.21%, 94.17%, and 94.19% on Chinese datasets from three different fields, at least 1.28, 1.49, and 0.46 percentage points higher than those of popular text classification methods. [Limitations] Experiments were only conducted on three corpora. [Conclusions] MFE-CapsNet uses an improved capsule network to mine semantic features more comprehensively and deeply, improving text classification performance.
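The [Methods] step of weighting the BiGRU hidden vectors with attention can be sketched as follows. This is a minimal illustrative NumPy version, not the paper's exact formulation; the projection `W`, scoring vector `v`, and all dimensions are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(H, W, v):
    """Weight each BiGRU hidden state and pool into one context vector.

    H: (seq_len, 2*hidden) concatenated forward/backward hidden states
    W: (2*hidden, attn_dim) projection; v: (attn_dim,) scoring vector
    """
    scores = np.tanh(H @ W) @ v   # one relevance score per time step
    alpha = softmax(scores)       # attention weights sum to 1
    return alpha @ H, alpha       # weighted context vector, weights

# Toy usage with made-up dimensions (BiGRU states assumed precomputed)
rng = np.random.default_rng(0)
H = rng.normal(size=(20, 256))    # 20 time steps, 2 * 128 hidden units
W = rng.normal(size=(256, 64))
v = rng.normal(size=(64,))
context, alpha = attention_pool(H, W, v)
```

The context vector keeps the hidden-state dimensionality while re-weighting time steps, which is how the attention layer improves the sequence model's feature representation before the capsule layer.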

Keywords: Text Classification; BiGRU; Attention; Capsule Network
Received: 2020-12-21      Published: 2021-07-06
CLC number: TP391.1
Funding: National Natural Science Foundation of China (Grant No. 71671057); Open Project of the Key Laboratory of Process Optimization & Intelligent Decision-making, Ministry of Education
Corresponding author: Yu Bengong, E-mail: bgyu19@163.com
Cite this article:
Yu Bengong,Zhu Xiaojie,Zhang Ziwei. A Capsule Network Model for Text Classification with Multi-level Feature Extraction. Data Analysis and Knowledge Discovery, 2021, 5(6): 93-102.
URL:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2020.1273      或      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2021/V5/I6/93
Fig.1  Model framework
Fig.2  Structure of the global feature extraction model
Fig.3  Flow of the dynamic routing algorithm
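Fig.3 outlines dynamic routing between capsules. For reference, a minimal NumPy sketch of the standard algorithm (Sabour et al., 2017, cited in the abstract's context) is given below; the shapes and random prediction vectors are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squash nonlinearity: short vectors shrink to ~0, long ones approach unit length."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=5):
    """Route prediction vectors u_hat of shape (num_in, num_out, dim) to output capsules."""
    num_in, num_out, dim = u_hat.shape
    b = np.zeros((num_in, num_out))             # routing logits
    for _ in range(num_iters):
        # coupling coefficients: softmax of logits over output capsules
        c = np.exp(b - b.max(axis=1, keepdims=True))
        c /= c.sum(axis=1, keepdims=True)
        s = np.einsum('ij,ijd->jd', c, u_hat)   # weighted sum over input capsules
        v = squash(s)                           # (num_out, dim) output capsules
        b += np.einsum('ijd,jd->ij', u_hat, v)  # agreement update
    return v

# Toy usage: 6 input capsules routed to 10 output capsules of dimension 16
u_hat = np.random.default_rng(1).normal(size=(6, 10, 16))
v = dynamic_routing(u_hat, num_iters=5)
```

The five routing iterations match the setting in Table 2; each iteration increases the coupling to output capsules that agree with an input capsule's prediction.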
Dataset             Label                  Samples
Car Reviews         Positive               14,513
                    Negative               14,482
Telecom Complaints  Business Rules          4,171
                    Operations Management   4,304
                    Promotion               4,977
                    Communication Issues    9,243
Toutiao News        Culture                 1,060
                    Entertainment           1,568
                    Sports                  1,540
                    Finance                 1,093
                    Real Estate               700
                    Automobile              1,433
Table 1  Dataset statistics
Parameter                 Value
Word embedding dimension  300
GRU hidden units          128
Number of capsules        10
Capsule dimension         16
Routing iterations        5
Optimizer                 Adam
Batch size                64
Epochs                    20
Dropout                   0.25
Table 2  Experimental parameter settings
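The settings in Table 2 can be collected into a single configuration object; the dict below simply mirrors the table (the key names are illustrative, not from the paper).

```python
# Hyperparameters from Table 2; key names are my own labels.
MFE_CAPSNET_PARAMS = {
    "embedding_dim": 300,      # word embedding dimension
    "gru_hidden_units": 128,   # GRU hidden units
    "num_capsules": 10,        # number of capsules
    "capsule_dim": 16,         # capsule dimension
    "routing_iterations": 5,   # dynamic routing iterations
    "optimizer": "Adam",
    "batch_size": 64,
    "epochs": 20,
    "dropout": 0.25,
}
```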
Dataset             Model        P      R      F1
Car Reviews         Transformer  90.89  90.88  90.88
                    TextRNN      91.40  91.37  91.28
                    GCN          93.25  93.27  93.25
                    G-Caps       93.81  93.78  93.78
                    TextRCNN     94.96  94.92  94.93
                    MFE-CapsNet  96.24  96.22  96.21
Telecom Complaints  Transformer  88.60  88.92  88.61
                    TextRNN      90.91  90.90  90.05
                    GCN          91.76  91.47  91.41
                    G-Caps       92.95  92.29  92.49
                    TextRCNN     93.98  92.53  92.68
                    MFE-CapsNet  94.47  94.02  94.17
Toutiao News        Transformer  85.99  83.29  83.79
                    TextRNN      89.71  89.03  89.14
                    GCN          92.07  92.02  92.02
                    G-Caps       93.21  92.67  92.83
                    TextRCNN     93.57  93.97  93.73
                    MFE-CapsNet  94.42  94.02  94.19
Table 3  Comparison results (%)
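For reference, the P, R, and F1 values in Table 3 can be computed per class and averaged, as sketched below. This page does not state whether macro or weighted averaging is used, so the macro variant here is an assumption.

```python
def macro_prf(y_true, y_pred, labels):
    """Macro-averaged precision, recall, and F1, returned as percentages."""
    ps, rs, fs = [], [], []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        ps.append(prec); rs.append(rec); fs.append(f1)
    n = len(labels)
    return 100 * sum(ps) / n, 100 * sum(rs) / n, 100 * sum(fs) / n

# Toy usage on a 4-sample binary task
p, r, f = macro_prf([0, 0, 1, 1], [0, 1, 1, 1], labels=[0, 1])
```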
Model          Car Reviews  Telecom Complaints  Toutiao News
BiGRU-CapsNet  94.59        94.67               94.39
MFE-CapsNet    96.22        95.05               94.46
Table 4  Validation accuracy of the models (%)
Fig.4  Effect of the number of routing iterations on F1
Function  Car Reviews  Telecom Complaints  Toutiao News
squash1x  95.57        92.76               92.31
squash2x  96.22        95.05               94.46
Table 5  Effect of the squash function on accuracy (%)
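Table 5 contrasts two squash variants, squash1x and squash2x, whose exact definitions are not given on this page. For reference, the standard squash nonlinearity from capsule networks, v = (||s||² / (1 + ||s||²)) · s / ||s||, can be sketched as:

```python
import numpy as np

def squash(s, eps=1e-8):
    """Standard capsule squash: preserves direction, bounds the norm below 1."""
    sq_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

# A vector of norm 5 is squashed to norm 25/26 ≈ 0.96, same direction
out = squash(np.array([3.0, 4.0]))
```

Scaled variants of this function change how quickly the output norm saturates, which is presumably what the squash1x/squash2x comparison probes.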