Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (12): 48-59    DOI: 10.11925/infotech.2096-3467.2021.0679
Automatic Classification of Citation Sentiment and Purposes with AttentionSBGMC Model
Zhou Wenyuan, Wang Mingyang, Jing Yu
College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
[Objective] This paper proposes a deep learning model, AttentionSBGMC, to improve the automatic classification of citation sentiment and purposes. [Methods] First, we used the pre-trained SciBERT model to obtain semantic representation vectors for the citation sentences. Second, according to the characteristics of the texts, we used a BiGRU network and a multi-scale convolutional neural network (Multi-CNN) to extract global temporal features and local key features, respectively. Third, we applied an attention mechanism to highlight the key features by reweighting the extracted features. Finally, linear layers performed the classification. [Results] We evaluated the method on two citation data sets. On the Abu-Jbara data set, the F1 values for the three classification tasks (subjective versus objective citation sentiment, positive versus negative citation sentiment, and citation purpose) were 86.74%, 91.14% and 84.92%, respectively. On the Athar data set, the F1 values for the two classification tasks (subjective versus objective citation sentiment, positive versus negative citation sentiment) were 88.50% and 86.59%, respectively. [Limitations] The proposed model was examined only on English data sets, and the evaluation needs to be expanded in future work. [Conclusions] The proposed model can effectively extract important features from the corpus and automatically classify citation sentiment and purposes.

Key words: Citation Sentiment Classification; SciBERT; Attention Mechanism; BiGRU; Multi-CNN
Received: 07 July 2021      Published: 20 January 2022
ZTFLH:  TP391  
Fund: National Natural Science Foundation of China (71473034)
Zhou Wenyuan, Wang Mingyang, Jing Yu. Automatic Classification of Citation Sentiment and Purposes with AttentionSBGMC Model. Data Analysis and Knowledge Discovery, 2021, 5(12): 48-59.

AttentionSBGMC Model
SciBERT Model
BiGRU Structure Diagram
Multi-CNN Structure Diagram
Schematic Diagram of Attention Mechanism
Schematic Diagram of the Original Data
Example of Citation Fragment Generation
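
| Classification task | Category | Ratio/% |
| --- | --- | --- |
| Citation sentiment | Positive | 34.50 |
| Citation sentiment | Neutral | 51.10 |
| Citation sentiment | Negative | 14.40 |
| Citation purpose | Criticism | 16.30 |
| Citation purpose | Comparison | 8.10 |

Data Set 1 Distribution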
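
| Classification task | Category | Ratio/% |
| --- | --- | --- |
| Citation sentiment | Positive | 10.20 |
| Citation sentiment | Neutral | 86.50 |
| Citation sentiment | Negative | 3.30 |

Data Set 2 Distribution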
Text Augmentation Example
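
| Parameter | Value |
| --- | --- |
| Word embedding dimension | 200 |
| Hidden layer size | 100 |
| Convolution kernel sizes | 1, 2, 3, 4 |
| Number of units (dimension) in the attention mechanism | 64 |
| Number of attention heads | 5 |
| Optimizer | Adam |
| Batch size | 16 |
| Epochs | 15 |
| Dropout | 0.25 |

Main Parameters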
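
To make the parameter table concrete, the following is a minimal PyTorch sketch of how a pipeline of this kind could be wired together. It is not the authors' implementation. Several points are assumptions: the class name `AttentionSBGMCSketch` is hypothetical; the BiGRU-then-Multi-CNN ordering is inferred only from the model name "SciBERT-BiGRU-Multi-CNN-Attention"; the attention layer is assumed to be additive multi-head pooling with 64 units and 5 heads; the input width of 768 is assumed to match a BERT-base-sized encoder such as SciBERT; and random tensors stand in for the encoder output.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionSBGMCSketch(nn.Module):
    """Hypothetical sketch: encoder vectors -> BiGRU -> multi-scale CNN -> attention -> linear."""

    def __init__(self, embed_dim=768, hidden_size=100, kernel_sizes=(1, 2, 3, 4),
                 attn_units=64, attn_heads=5, num_classes=2, dropout=0.25):
        super().__init__()
        # Bidirectional GRU: captures sequence-level (temporal) context.
        self.bigru = nn.GRU(embed_dim, hidden_size, batch_first=True, bidirectional=True)
        gru_dim = 2 * hidden_size
        # One 1-D convolution per kernel size: captures local n-gram features.
        self.convs = nn.ModuleList(
            [nn.Conv1d(gru_dim, hidden_size, kernel_size=k) for k in kernel_sizes]
        )
        conv_dim = hidden_size * len(kernel_sizes)
        # Additive attention (assumed form): tanh projection, then one score per head.
        self.attn_hidden = nn.Linear(conv_dim, attn_units)
        self.attn_score = nn.Linear(attn_units, attn_heads)
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(attn_heads * conv_dim, num_classes)

    def forward(self, token_vectors):
        # token_vectors: (batch, seq_len, embed_dim), e.g. encoder hidden states.
        seq, _ = self.bigru(token_vectors)            # (batch, seq_len, 2 * hidden)
        x = seq.transpose(1, 2)                       # (batch, 2 * hidden, seq_len)
        local = []
        for conv in self.convs:
            k = conv.kernel_size[0]
            padded = F.pad(x, (0, k - 1))             # right-pad so output keeps seq_len
            local.append(torch.relu(conv(padded)))    # (batch, hidden, seq_len)
        feats = torch.cat(local, dim=1).transpose(1, 2)   # (batch, seq_len, conv_dim)
        scores = self.attn_score(torch.tanh(self.attn_hidden(feats)))
        weights = torch.softmax(scores, dim=1)        # normalize over time steps
        pooled = torch.bmm(weights.transpose(1, 2), feats)  # (batch, heads, conv_dim)
        return self.classifier(self.dropout(pooled.flatten(1)))


# Usage with random stand-in inputs: batch of 16 sentences, 40 tokens each.
model = AttentionSBGMCSketch(num_classes=2)
model.eval()
with torch.no_grad():
    logits = model(torch.randn(16, 40, 768))
```

Here `logits` has one row per sentence and one column per class. The table lists a word embedding dimension of 200, so `embed_dim` is left as a constructor argument rather than fixed.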
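
| Actual class | Predicted positive | Predicted negative |
| --- | --- | --- |
| Positive | TP (true positive) | FN (false negative) |
| Negative | FP (false positive) | TN (true negative) |

Confusion Matrix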
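
The P, R and F1 columns in the result tables follow from the confusion matrix entries. The sketch below uses the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 as their harmonic mean). The function names and the example counts are illustrative assumptions, not taken from the paper.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from_pr(precision, recall)


# Hypothetical counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=10)

# Consistency check against a reported row: GloVe-BiGRU on the
# subjective/objective task has P = 72.73 and R = 56.15.
glove_f1 = round(f1_from_pr(72.73, 56.15), 2)
```

With the GloVe-BiGRU values, the harmonic mean reproduces the reported F1 of 63.37.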

| Method | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| GloVe-BiGRU | 72.73 | 56.15 | 63.37 |
| BERT-BiGRU | 83.84 | 83.65 | 83.79 |
| SciBERT-BiGRU | 84.85 | 84.80 | 84.82 |
| SciBERT-BiGRU-Multi-CNN | 85.17 | 85.04 | 85.10 |
| SciBERT-Multi-CNN-BiGRU-Attention | 85.35 | 85.86 | 85.87 |
| SciBERT-BiGRU-Multi-CNN-Attention | 86.76 | 86.72 | 86.74 |

Results of Each Model on Subjective/Objective Classification

| Method | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| GloVe-BiGRU | 80.71 | 59.25 | 68.33 |
| BERT-BiGRU | 88.75 | 87.32 | 88.03 |
| SciBERT-BiGRU | 90.69 | 87.71 | 88.98 |
| SciBERT-BiGRU-Multi-CNN | 90.78 | 89.24 | 90.01 |
| SciBERT-Multi-CNN-BiGRU-Attention | 92.06 | 89.14 | 90.58 |
| SciBERT-BiGRU-Multi-CNN-Attention | 92.26 | 90.06 | 91.14 |

Results of Each Model on Positive/Negative Classification

| Method | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| GloVe-BiGRU | 68.26 | 52.18 | 59.15 |
| BERT-BiGRU | 82.79 | 79.98 | 81.26 |
| SciBERT-BiGRU | 83.26 | 80.39 | 81.80 |
| SciBERT-BiGRU-Multi-CNN | 84.68 | 81.59 | 83.11 |
| SciBERT-Multi-CNN-BiGRU-Attention | 85.58 | 82.75 | 84.14 |
| SciBERT-BiGRU-Multi-CNN-Attention | 86.67 | 83.24 | 84.92 |

Results of Each Model on Citation Purpose Classification

| Method | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| NB with Syntactic Features [30] | 69.00 | 62.50 | 64.40 |
| SVM with Features [38] | 67.10 | 70.60 | 68.80 |
| SVM with TF-IDF [20] | 77.90 | 76.30 | 77.10 |
| SVM with Embedding [20] | 81.30 | 75.40 | 77.30 |
| CNN with Embedding [20] | 82.00 | 75.90 | 78.80 |
| LSTM [19] | 80.08 | 74.30 | 77.40 |
| BiLSTM [19] | 80.40 | 77.56 | 79.10 |
| SciBERT-BiGRU-Multi-CNN-Attention | 83.76 | 82.63 | 83.19 |

Comparison of Methods on Citation Sentiment Classification

| Method | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| NB with Syntactic Features [30] | 65.02 | 58.50 | 60.40 |
| SVM with Features [38] | 54.90 | 62.50 | 58.40 |
| SVM with TF-IDF [20] | 74.30 | 70.90 | 72.60 |
| SVM with Embedding [20] | 86.80 | 64.70 | 74.10 |
| CNN with Embedding [20] | 80.80 | 68.80 | 74.30 |
| LSTM [19] | 79.87 | 67.80 | 73.21 |
| BiLSTM [19] | 77.22 | 73.11 | 75.11 |
| SciBERT-BiGRU-Multi-CNN-Attention | 86.67 | 83.24 | 84.92 |

Comparison of Methods on Citation Purpose Classification
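
| Task | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| Positive/negative classification | 87.42 | 89.60 | 88.50 |
| Subjective/objective classification | 85.53 | 87.64 | 86.59 |
| Three-class citation sentiment classification | 84.58 | 86.67 | 85.61 |

Classification Results on Data Set 2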
