Identifying Citation Texts with Unsupervised Method
Hyonil Kim, Ou Shiyan
School of Information Management, Nanjing University, Nanjing 210023, China
Abstract [Objective] This paper proposes a method to automatically identify citation texts and compare the contents of citation sentences. [Methods] We developed an unsupervised method to find implicit citation sentences and then compared the similarity between these sentences and the citing/cited papers. We combined the vector space model and the word embedding model to calculate the similarity precisely. [Results] We identified the implicit citation sentences of two highly cited papers from 200 citing articles, and the proposed method's F-value was above 92%. Comparing the contents of the explicit and implicit citation sentences, we found significant differences in citation functions and sentiments. There were more implicit than explicit citation sentences for research background and technical basis, and fewer for research basis and comparison. 45.3% of the explicit citation sentences were positive references, while 78.8% of the implicit citation sentences were neutral. [Limitations] We only investigated citation texts at the sentence level. More research is needed on clause-level and phrase-level identification. [Conclusions] The proposed method can effectively identify implicit citation sentences.
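The abstract does not give the exact formula used to combine the two models. As a minimal sketch only, the idea of blending a vector space similarity with a word embedding similarity, and of scoring the result with an F-value, might look like the following. The weight `alpha`, the toy embedding table `EMB`, the example tokens, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def bag_of_words(tokens):
    """Vector space model: raw term counts (a real system might use TF-IDF)."""
    return dict(Counter(tokens))


def mean_embedding(tokens, emb):
    """Average the embeddings of the tokens that have one; empty dict if none."""
    vecs = [emb[t] for t in tokens if t in emb]
    if not vecs:
        return {}
    dim = len(vecs[0])
    return {i: sum(vec[i] for vec in vecs) / len(vecs) for i in range(dim)}


def combined_similarity(sentence, paper, emb, alpha=0.5):
    """Weighted blend of the two similarities; alpha is a hypothetical weight."""
    vsm = cosine(bag_of_words(sentence), bag_of_words(paper))
    dense = cosine(mean_embedding(sentence, emb), mean_embedding(paper, emb))
    return alpha * vsm + (1 - alpha) * dense


def f_value(predicted, gold):
    """Harmonic mean of precision and recall over sets of sentence ids."""
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy two-dimensional embeddings, invented for this example.
EMB = {
    "citation": (1.0, 0.0),
    "reference": (0.9, 0.1),
    "sentence": (0.8, 0.2),
    "weather": (0.0, 1.0),
    "rain": (0.1, 0.9),
}

cited_paper = ["citation", "sentence", "classification"]
related = ["reference", "sentence"]    # no citation marker, similar topic
unrelated = ["weather", "rain"]        # different topic

score_related = combined_similarity(related, cited_paper, EMB)
score_unrelated = combined_similarity(unrelated, cited_paper, EMB)
print(score_related, score_unrelated)
```

In this sketch the embedding term lets "reference" count as close to "citation" even though the two sentences share only one surface word, which is the kind of gap a pure term-matching model would miss. A sentence would then be labelled an implicit citation when its score exceeds a threshold; how the paper sets such a threshold is not stated in the abstract.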
Received: 11 June 2020
Published: 02 September 2020
Fund: This work is supported by the National Social Science Fund of China (Grant No. 17ATQ001)
Corresponding Author: Ou Shiyan, E-mail: oushiyan@nju.edu.cn