Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (8): 50-62    DOI: 10.11925/infotech.2096-3467.2019.1292
Question Classification Based on Bidirectional GRU with Hierarchical Attention and Multi-channel Convolution
Yu Bengong1,2, Zhu Mengdi1
1School of Management, Hefei University of Technology, Hefei 230009, China
2Key Laboratory of Process Optimization & Intelligent Decision-making, Ministry of Education, Hefei University of Technology, Hefei 230009, China

[Objective] This paper proposes a method for extracting multi-level features from question texts, aiming to capture their semantics more fully and to address the issues facing text classification. [Methods] First, we constructed multi-channel attention feature matrices at the word level using a multi-feature attention mechanism, which enriches the semantic representation of the text and makes full use of the interrogative-word, property, and position features of each question. Second, we convolved these matrices to obtain phrase-level feature representations. Third, we rearranged the resulting vectors and fed them to a bidirectional GRU (Gated Recurrent Unit) to capture forward and backward semantic features. Finally, we applied latent topic attention to strengthen the topic information in the bidirectional contextual features and generated the final text vector used for classification. [Results] The proposed model achieved accuracies of 93.89%, 94.47%, and 94.23% on three Chinese question datasets (Fudan Question Bank, NLPCC 2016, and NLPCC 2017), up to 5.82 and 4.50 percentage points higher than LSTM and CNN, respectively. [Limitations] We evaluated the new model on only three Chinese question corpora. [Conclusions] The proposed model captures the semantic features of question texts effectively and improves the performance of question classification.

Key words: Question Classification; Multi-channel; Hierarchical Attention; Convolution; GRU
Received: 02 December 2019      Published: 21 May 2020
CLC Number (ZTFLH): TP391
Corresponding Author: Zhu Mengdi

Cite this article:

Yu Bengong, Zhu Mengdi. Question Classification Based on Bidirectional GRU with Hierarchical Attention and Multi-channel Convolution. Data Analysis and Knowledge Discovery, 2020, 4(8): 50-62.


Figure: The Architecture of HAMCC-BGRU
Figure: Construction of Word Vector Matrix Based on Different Attention Mechanisms
$$\alpha_{i}^{e} = \mathrm{innerproduct}(x_e, x_i) \tag{1}$$
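Equation (1) scores each word vector against a feature vector by an inner product. The sketch below illustrates that step in plain Python. It assumes that x_e is the vector of a feature word (for example the interrogative word) and that the raw scores are normalised with a softmax; neither detail is stated on this page, and the function names and toy vectors are hypothetical.

```python
import math

def inner_product(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def attention_weights(x_e, word_vectors):
    """Raw scores alpha_i^e = <x_e, x_i> as in Eq. (1).

    The softmax normalisation that follows is an assumption made for
    this illustration, not something confirmed by the source.
    """
    scores = [inner_product(x_e, x_i) for x_i in word_vectors]
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return scores, [e / total for e in exps]

# Toy 2-dimensional word vectors (hypothetical values).
word_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x_e = word_vectors[2]  # assume the third word is the feature word

scores, weights = attention_weights(x_e, word_vectors)
print(scores)
print(weights)
```

Words whose vectors align more closely with x_e receive larger weights, which is the effect the attention channel relies on.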
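Question type | Example
Description (DES) | What is the working principle of a centrifugal humidifier? (离心式加湿器的原理是什么)
Human (HUM) | Who directed Hamlet? (哈姆雷特是谁导演的)
Location (LOC) | Where was Aokang Group Co., Ltd. founded? (奥康集团有限公司在哪里成立的)
Number (NUM) | How many characters are there in Lu Xun's "Dawn Blossoms Plucked at Dusk"? (鲁迅的朝花夕拾共有多少字)
Time (TIME) | When was the novel "Criminology" published? (小说《犯罪学》什么时候出版的)
Entity (OBJ) | Which publisher published the book "Management"? (管理学这本书是哪个出版社出版的)
Table: Chinese Question Category System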
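Experimental environment | Configuration
Operating system | Windows 10 Enterprise
CPU | Intel Core i5-4210U 2.40 GHz
Graphics card | AMD Radeon R7 M265
Memory | 12 GB
Programming language | Python 3.7
Deep learning library | TensorFlow + Keras
Table: Experimental Environment and Configuration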
Parameter | Value
Convolution kernel width | 3
Number of convolution kernels | 64
GRU units | 50
Batch size | 32
Epochs | 20
Optimizer | Adam
Dropout rate | 0.6
Table: Parameter Settings
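To see what the parameter settings imply for the size of the intermediate representations, the following sketch does the shape bookkeeping for one attention channel. It is a rough illustration under stated assumptions: the convolution uses no padding, the forward and backward GRU states are concatenated, and the question length of 20 tokens is a made-up value. None of these three details is given on this page.

```python
def conv1d_output_length(seq_len, kernel_width, stride=1):
    """Number of output positions of a 1-D convolution without padding."""
    return (seq_len - kernel_width) // stride + 1

# Values taken from the parameter settings table.
KERNEL_WIDTH = 3
NUM_KERNELS = 64
GRU_UNITS = 50

# Six question categories: DES, HUM, LOC, NUM, TIME, OBJ.
NUM_CLASSES = 6

# Hypothetical question length in tokens (not from the source).
seq_len = 20

# Phrase-level features: one 64-dimensional vector per window of 3 words.
phrase_steps = conv1d_output_length(seq_len, KERNEL_WIDTH)
phrase_shape = (phrase_steps, NUM_KERNELS)

# Bidirectional GRU: forward and backward states concatenated (assumption).
bigru_shape = (phrase_steps, 2 * GRU_UNITS)

# Attention pooling collapses the time axis into a single text vector,
# which a classifier then maps to the six categories.
text_vector_dim = 2 * GRU_UNITS

print(phrase_shape, bigru_shape, text_vector_dim, NUM_CLASSES)
```

With padding that preserves sequence length, the first dimension would stay at 20 instead of shrinking to 18; the feature dimensions would be unchanged.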
Figure: Performance of Different Vector Dimensions
Figure: Training Time Spent on Different Vector Dimensions
Model | Fudan Question Bank | NLPCC 2016 | NLPCC 2017
SVM | 72.86% | 72.24% | 73.16%
CNN | 90.31% | 89.97% | 90.65%
LSTM | 88.92% | 88.65% | 89.24%
GRU | 89.75% | 89.83% | 89.57%
C-LSTM | 91.88% | 91.34% | 91.75%
C-GRU | 91.72% | 91.53% | 92.04%
MAC-LSTM | 92.59% | 93.21% | 92.92%
HAMCC-BGRU | 93.89% | 94.47% | 94.23%
Table: The Classification Accuracy of Different Models
Figure: Classification Accuracy of Different Models
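The gains quoted in the abstract (5.82 and 4.50 over LSTM and CNN) can be checked against the accuracy table. The short script below recomputes the percentage-point difference on each dataset; the numbers are copied from the table, while the helper names are only illustrative.

```python
# Accuracy (%) copied from the table above; one value per dataset.
DATASETS = ("Fudan Question Bank", "NLPCC 2016", "NLPCC 2017")
ACCURACY = {
    "SVM": (72.86, 72.24, 73.16),
    "CNN": (90.31, 89.97, 90.65),
    "LSTM": (88.92, 88.65, 89.24),
    "GRU": (89.75, 89.83, 89.57),
    "C-LSTM": (91.88, 91.34, 91.75),
    "C-GRU": (91.72, 91.53, 92.04),
    "MAC-LSTM": (92.59, 93.21, 92.92),
    "HAMCC-BGRU": (93.89, 94.47, 94.23),
}

def gains_over(baseline, model="HAMCC-BGRU"):
    """Percentage-point gain of `model` over `baseline` on each dataset."""
    return {
        name: round(ours - base, 2)
        for name, ours, base in zip(DATASETS, ACCURACY[model], ACCURACY[baseline])
    }

def largest_gain(baseline):
    """Dataset on which the gain over `baseline` is largest, and its size."""
    gains = gains_over(baseline)
    best = max(gains, key=gains.get)
    return best, gains[best]

for baseline in ("LSTM", "CNN"):
    print(baseline, gains_over(baseline), largest_gain(baseline))
```

Both quoted figures turn out to be the largest per-dataset gains, and both occur on NLPCC 2016; the gains on the other two datasets are smaller.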
Model | Fudan Question Bank | NLPCC 2016 | NLPCC 2017
C-GRU | 91.72% | 91.53% | 92.04%
IWC-BGRU | 92.93% | 93.14% | 93.04%
PC-BGRU | 93.09% | 93.26% | 93.15%
LC-BGRU | 93.19% | 93.37% | 93.25%
TCC-BGRU | 93.22% | 93.51% | 93.37%
LTC-BGRU | 93.04% | 93.32% | 93.07%
HAMCC-BGRU | 93.89% | 94.47% | 94.23%
Table: The Effect of Different Attention Mechanisms on the Accuracy of the Model