[Objective] This paper proposes a deep learning model, AttentionSBGMC, to improve the automatic classification of citation sentiment and citation purpose. [Methods] First, we used the pre-trained SciBERT model to obtain semantic representation vectors for the citation sentences. Then, according to the characteristics of the texts, we used a BiGRU neural network to extract their global temporal features and a multi-scale convolutional neural network (Multi-CNN) to extract their local key features. Next, we applied an attention mechanism that reweights the extracted features to highlight the key ones. Finally, linear layers performed the classification. [Results] We evaluated the model on two citation data sets. On the Abu-Jbara data set, the F1 values for three classification tasks (subjective vs. objective citation sentiment, positive vs. negative citation sentiment, and citation purpose) were 86.74%, 91.14% and 84.92%, respectively. On the Athar data set, the F1 values for two classification tasks (subjective vs. objective citation sentiment, and positive vs. negative citation sentiment) were 88.50% and 86.59%, respectively. [Limitations] The model was examined only on English data sets; the evaluation needs to be expanded in future work. [Conclusions] The proposed model can effectively extract the important features of the corpus and automatically classify citation sentiment and citation purpose.
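The attention step described in [Methods] can be illustrated with a minimal sketch. It is not the authors' implementation: the dot-product scoring function, the `query` vector (standing in for a learned parameter), and the toy feature values are all assumptions made for illustration. The sketch scores each extracted feature vector, normalizes the scores with a softmax, and returns the reweighted sum that a linear classification layer would then consume.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Reweight per-position feature vectors and pool them into one vector.

    features: list of T vectors of dimension d (e.g. BiGRU or CNN outputs).
    query:    vector of dimension d; a hypothetical learned parameter.
    """
    # Relevance score of each position: dot product with the query (assumed form).
    scores = [sum(f_j * q_j for f_j, q_j in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    # Weighted sum over positions, so high-scoring features dominate the result.
    pooled = [sum(w * f[j] for w, f in zip(weights, features)) for j in range(dim)]
    return weights, pooled


# Toy input: three positions with two-dimensional features (made-up values).
features = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
query = [1.0, 1.0]
weights, pooled = attention_pool(features, query)
```

Here the second position receives the largest weight because its score is highest, which is the sense in which attention "highlights" key features.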