Extracting Citation Contents with Coreference Resolution
Tan Ying1, Tang Yifei2
1School of Public Administration, Hubei University, Wuhan 430062, China
2School of Information Management, Central China Normal University, Wuhan 430079, China
Abstract [Objective] This paper aims to extract scientific citations and their contexts accurately, so as to improve the results of citation analysis. [Methods] We divided the citation extraction task into three subtasks: citation sentence extraction, citation context identification, and citation metadata extraction. We then proposed a method based on coreference resolution to identify and extract the citation context. [Results] We tested the method on Chinese periodicals that use the sequential-numbering citation system, and it extracted the citation sentences and references correctly. The F1 value for identifying the citation context ranged from 0.780 to 0.849. [Limitations] Because the available Chinese scientific citation corpus is limited and the experimental data set is small, the proposed method might not work as effectively in other fields. [Conclusions] Our study streamlines the steps of citation content analysis and broadens the scope of usable data, providing support for researchers in citation content analysis.
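As a rough illustration of the three subtasks named in the abstract, the following Python sketch finds citation sentences by their bracketed numeric markers, links each marker to a reference entry, and extends the context to following sentences. It is not the authors' method: the cue-word list `ANAPHORIC_CUES`, the naive sentence splitter, and all function names are assumptions made for illustration, and the cue-word check is only a crude stand-in for a real coreference resolver.

```python
import re

# Bracketed numeric markers such as [3] or [1, 2] (sequential-numbering style).
MARKER = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")

# Hypothetical cue words; a real system would run coreference resolution instead.
ANAPHORIC_CUES = ("they ", "their ", "this method ", "this approach ",
                  "the authors ", "it ")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def cited_numbers(sentence):
    """Return the reference numbers cited in a sentence, in order."""
    numbers = []
    for match in MARKER.finditer(sentence):
        numbers.extend(int(n) for n in match.group(1).split(","))
    return numbers


def extract_citations(text, references):
    """Return one record per citation sentence, with context and metadata."""
    sentences = split_sentences(text)
    records = []
    for i, sentence in enumerate(sentences):
        numbers = cited_numbers(sentence)
        if not numbers:
            continue
        # Extend the context while the following sentences open with a cue
        # word and carry no citation marker of their own.
        context = [sentence]
        for follower in sentences[i + 1:]:
            opens_with_cue = follower.lower().startswith(ANAPHORIC_CUES)
            if cited_numbers(follower) or not opens_with_cue:
                break
            context.append(follower)
        records.append({
            "sentence": sentence,
            "context": context,
            "references": [references.get(n) for n in numbers],
        })
    return records
```

Calling `extract_citations(text, references)` with a body text and a dictionary that maps reference numbers to reference strings returns one record per citation sentence. The splitter breaks on abbreviations such as "et al.", so a production pipeline would need a proper sentence segmenter, particularly for Chinese text.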
Received: 08 March 2021
Published: 15 September 2021
Fund: National Social Science Fund of China (19ZDA345)
Corresponding Author:
Tan Ying, ORCID: 0000-0002-7987-4696
E-mail: tanying1219@qq.com