Data Analysis and Knowledge Discovery, 2020, Vol. 4, Issue 8: 50-62. https://doi.org/10.11925/infotech.2096-3467.2019.1292
Research Article
Question Classification Based on Bidirectional GRU with Hierarchical Attention and Multi-channel Convolution
Yu Bengong1,2, Zhu Mengdi1
1School of Management, Hefei University of Technology, Hefei 230009, China
2Key Laboratory of Process Optimization & Intelligent Decision-making, Ministry of Education, Hefei University of Technology, Hefei 230009, China
Abstract

[Objective] Dialogue questions are short and their features sparse. This paper extracts hierarchical, multi-faceted features from question texts to better understand question semantics and improve classification performance. [Methods] First, we constructed multi-channel attention feature matrices with a word-level multi-feature attention mechanism, enriching the semantic representation of the texts by fully exploiting the interrogative words, part-of-speech tags, and word-position features of each question. Then, we convolved the multi-channel matrices with a convolutional neural network to extract deeper phrase-level features. Third, we fused the phrase-level features and fed them into a bidirectional GRU (Gated Recurrent Unit) to capture forward and backward contextual information. Finally, we applied latent topic attention to strengthen the topic information in the bidirectional contextual features, fused the two directions into the final text vector, and obtained the classification result with a Softmax layer. [Results] The proposed HAMCC-BGRU model reached accuracies of 93.89%, 94.47%, and 94.23% on three Chinese question datasets, up to 5.82% and 4.50% higher than the LSTM and CNN models, respectively. [Limitations] The model was only evaluated on three Chinese question corpora. [Conclusions] The proposed model mines the semantic features of question texts more comprehensively and at a deeper level, compensates for inaccurate understanding of question intent, and improves question classification performance.

Keywords: Question Classification; Multi-channel; Hierarchical Attention; Convolution; GRU
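To make the pipeline described in the abstract concrete, the following is a minimal Keras sketch of a model with this shape (TensorFlow + Keras, matching Table 2). The layer sizes follow Table 3; the input shape, the simple weighted-sum attention layer, and names such as build_hamcc_bgru are illustrative assumptions, not the authors' released implementation of latent topic attention.

```python
# Minimal sketch: multi-channel word-level matrices -> per-channel convolution ->
# channel fusion -> bidirectional GRU -> attention pooling -> softmax classifier.
import tensorflow as tf
from tensorflow.keras import layers, models

MAX_LEN, EMB_DIM, N_CHANNELS, N_CLASSES = 30, 100, 3, 6   # assumed sizes

def build_hamcc_bgru():
    # One attention-weighted embedding matrix per feature channel
    # (interrogative word, part of speech, word position).
    inputs = layers.Input(shape=(N_CHANNELS, MAX_LEN, EMB_DIM))

    # Phrase-level features: convolve each channel (kernel width 3, 64 filters).
    conv = layers.TimeDistributed(
        layers.Conv1D(filters=64, kernel_size=3, padding="same", activation="relu")
    )(inputs)                                                  # (batch, 3, MAX_LEN, 64)

    # Fuse the channels before the recurrent layer.
    fused = layers.Lambda(lambda t: tf.reduce_sum(t, axis=1))(conv)   # (batch, MAX_LEN, 64)

    # Forward and backward context with a bidirectional GRU (50 units).
    context = layers.Bidirectional(
        layers.GRU(units=50, return_sequences=True, dropout=0.6)
    )(fused)                                                   # (batch, MAX_LEN, 100)

    # Generic attention pooling as a stand-in for the paper's latent topic attention.
    scores = layers.Dense(1, activation="tanh")(context)
    weights = layers.Softmax(axis=1)(scores)
    pooled = layers.Lambda(
        lambda t: tf.reduce_sum(t[0] * t[1], axis=1)
    )([context, weights])                                      # (batch, 100)

    outputs = layers.Dense(N_CLASSES, activation="softmax")(pooled)
    return models.Model(inputs, outputs)

model = build_hamcc_bgru()
model.summary()
```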
Received: 2019-12-02      Publication date: 2020-05-21
Chinese Library Classification (ZTFLH): TP391
Funding: This work is supported by the National Natural Science Foundation of China project "Research on Product R&D Knowledge Integration and Service Mechanisms Based on Manufacturing Big Data" (Grant No. 71671057).
Corresponding author: Zhu Mengdi, E-mail: 2466004852@qq.com
Cite this article:
Yu Bengong, Zhu Mengdi. Question Classification Based on Bidirectional GRU with Hierarchical Attention and Multi-channel Convolution. Data Analysis and Knowledge Discovery, 2020, 4(8): 50-62.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2019.1292      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2020/V4/I8/50
Fig. 1  Structure of the HAMCC-BGRU model
Fig. 2  Construction of word vector matrices based on different attention mechanisms
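As a rough illustration of how such word-level attention could produce the multi-channel matrices, the sketch below scores every word once per feature (interrogative word, part of speech, position), normalizes the scores with a softmax, and re-weights the embedding matrix to form one channel per feature. The scoring vectors and function names here are placeholders, not the paper's exact formulation.

```python
# Build one attention-weighted copy of the embedding matrix per word-level feature.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def build_channels(embeddings, feature_scores):
    """embeddings: (seq_len, emb_dim); feature_scores: list of (seq_len,) arrays."""
    channels = []
    for scores in feature_scores:
        weights = softmax(scores)                       # word-level attention weights
        channels.append(embeddings * weights[:, None])  # re-weighted embedding matrix
    return np.stack(channels)                           # (n_channels, seq_len, emb_dim)

# Toy example: 5 words, 4-dimensional embeddings, three hypothetical feature scorings.
emb = np.random.rand(5, 4)
interrogative = np.array([0.0, 0.0, 1.0, 0.0, 0.0])   # a "what/where"-type word at index 2
pos_tags      = np.array([0.2, 0.8, 0.5, 0.9, 0.1])   # e.g., noun/verb salience
positions     = 1.0 - np.arange(5) / 5.0              # earlier words weighted slightly higher
print(build_channels(emb, [interrogative, pos_tags, positions]).shape)  # (3, 5, 4)
```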
Question type    Example
Description (DES)    离心式加湿器的原理是什么 (What is the working principle of a centrifugal humidifier?)
Person (HUM)    哈姆雷特是谁导演的 (Who directed Hamlet?)
Location (LOC)    奥康集团有限公司在哪里成立的 (Where was Aokang Group Co., Ltd. founded?)
Number (NUM)    鲁迅的朝花夕拾共有多少字 (How many characters are there in Lu Xun's Dawn Blossoms Plucked at Dusk?)
Time (TIME)    小说《犯罪学》什么时候出版的 (When was the novel Criminology published?)
Entity (OBJ)    管理学这本书是哪个出版社出版的 (Which publisher published the book Management?)
Table 1  Chinese question classification taxonomy
Item    Configuration
Operating system    Windows 10 Enterprise
CPU    Intel Core i5-4210U @ 2.40 GHz
Graphics card    AMD Radeon R7 M265
Memory    12 GB
Programming language    Python 3.7
Deep learning libraries    TensorFlow + Keras
Table 2  Experimental environment and configuration
Parameter    Value
Convolution kernel width    3
Number of convolution kernels    64
Number of GRU units    50
Batch size    32
Epochs    20
Optimizer    Adam
Dropout rate    0.6
Table 3  Model parameter settings
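Continuing the Keras sketch given after the abstract, the following shows how the Table 3 settings would be applied during training (Adam optimizer, batch size 32, 20 epochs; the 0.6 dropout rate already sits inside the GRU layer). The random tensors only stand in for the prepared multi-channel matrices and one-hot category labels.

```python
# Hypothetical training setup reusing build_hamcc_bgru() from the earlier sketch.
import numpy as np
import tensorflow as tf

model = build_hamcc_bgru()
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

x_dummy = np.random.rand(128, 3, 30, 100).astype("float32")   # (samples, channels, seq_len, emb_dim)
y_dummy = tf.keras.utils.to_categorical(
    np.random.randint(0, 6, size=128), num_classes=6)         # six question types (Table 1)

model.fit(x_dummy, y_dummy, batch_size=32, epochs=20, validation_split=0.1)
```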
Fig. 3  Effect of different word embedding dimensions
Fig. 4  Training time required for different word embedding dimensions
Model    Fudan Question Bank    NLPCC 2016    NLPCC 2017
SVM 72.86% 72.24% 73.16%
CNN 90.31% 89.97% 90.65%
LSTM 88.92% 88.65% 89.24%
GRU 89.75% 89.83% 89.57%
C-LSTM 91.88% 91.34% 91.75%
C-GRU 91.72% 91.53% 92.04%
MAC-LSTM 92.59% 93.21% 92.92%
HAMCC-BGRU 93.89% 94.47% 94.23%
Table 4  Classification accuracy of different models
Fig. 5  Classification accuracy comparison of different models
Model    Fudan Question Bank    NLPCC 2016    NLPCC 2017
C-GRU 91.72% 91.53% 92.04%
IWC-BGRU 92.93% 93.14% 93.04%
PC-BGRU 93.09% 93.26% 93.15%
LC-BGRU 93.19% 93.37% 93.25%
TCC-BGRU 93.22% 93.51% 93.37%
LTC-BGRU 93.04% 93.32% 93.07%
HAMCC-BGRU 93.89% 94.47% 94.23%
Table 5  Effect of different attention mechanisms on model accuracy