Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (1): 89-98    DOI: 10.11925/infotech.2096-3467.2019.0869
Automatic Identification of Term Citation Object with Feature Fusion
Na Ma 1,2, Zhixiong Zhang 1,2,3,4, Pengmin Wu 5
1National Science Library, Chinese Academy of Sciences, Beijing 100190, China
2School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
3Wuhan Library, Chinese Academy of Sciences, Wuhan 430071, China
4Hubei Key Laboratory of Big Data in Science and Technology, Wuhan 430071, China
5Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

[Objective] This paper explores methods for automatically identifying term citation objects in scientific papers, using feature fusion and a pseudo-label noise-reduction strategy. [Methods] First, we recast the identification of term citation objects as a sequence labeling task. Then, we fused linguistic and heuristic features of term citation objects in the input layer of a BiLSTM-CNN-CRF model, which enhanced their feature representations. Finally, we designed a pseudo-label learning noise-reduction mechanism and compared the performance of different models. [Results] The best F1 score of our method reached 0.6018, about 8 percentage points higher than that of the BERT model. [Limitations] The experimental data were collected from computer science articles, so the model needs to be examined with data from other fields. [Conclusions] The proposed method can effectively identify term citation objects.
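The reformulation described in [Methods] can be illustrated with a minimal sketch (not the authors' code): each token in a sentence is tagged with a BIO label marking whether it begins, continues, or lies outside a term citation object. The span boundaries and tag names here are illustrative.

```python
# Minimal sketch of recasting term citation object identification as
# BIO sequence labeling: B-OBJ begins an object, I-OBJ continues it,
# O marks tokens outside any citation object.

def to_bio(tokens, object_spans):
    """Return BIO labels for tokens given (start, end) spans of
    citation objects; end indices are exclusive."""
    labels = ["O"] * len(tokens)
    for start, end in object_spans:
        labels[start] = "B-OBJ"
        for i in range(start + 1, end):
            labels[i] = "I-OBJ"
    return labels

# Example drawn from the kind of sentence shown later in the paper:
tokens = ["we", "trained", "the", "IBM", "word-alignment", "model", "4", "REF7"]
print(to_bio(tokens, [(3, 7)]))
# → ['O', 'O', 'O', 'B-OBJ', 'I-OBJ', 'I-OBJ', 'I-OBJ', 'O']
```

A sequence model such as BiLSTM-CNN-CRF is then trained to predict these labels token by token.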

Key words: Citation Object Identification; Feature Fusion; Pseudo-Label Learning; BiLSTM-CNN-CRF
Received: 23 July 2019      Published: 14 March 2020
CLC number: TP391
Corresponding Author: Zhixiong Zhang

Cite this article:

Na Ma,Zhixiong Zhang,Pengmin Wu. Automatic Identification of Term Citation Object with Feature Fusion. Data Analysis and Knowledge Discovery, 2020, 4(1): 89-98.


CNN Model for Extracting Char-level Representations
Feature Representation of Term Citation Object
Pseudo-label Learning Flowchart
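The pseudo-label learning flow above can be sketched as a confidence-based selection step: predictions on unlabeled data are kept for retraining only when the model is sufficiently confident, which filters out likely label noise. This is a hedged sketch of the general technique; the threshold value and data structures are illustrative, not the paper's.

```python
# Sketch of the noise-reduction step in pseudo-label learning:
# keep only high-confidence predictions on unlabeled data as new
# training examples; discard the rest as probable noise.

def select_pseudo_labels(predictions, threshold=0.9):
    """predictions: list of (example, predicted_label, confidence).
    Returns (example, label) pairs whose confidence >= threshold."""
    return [(x, label) for x, label, conf in predictions if conf >= threshold]

preds = [("sent1", "B-OBJ", 0.95), ("sent2", "O", 0.60), ("sent3", "I-OBJ", 0.91)]
print(select_pseudo_labels(preds))
# → [('sent1', 'B-OBJ'), ('sent3', 'I-OBJ')]
```

The selected pairs are merged into the labeled set and the model is retrained, repeating until performance stops improving.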
LSTM layers	2
Hidden units	100
Learning rate	0.015
Dropout rate	0.5
Loss function	Cross-entropy loss
Batch size	10
Optimizer	Adam
L2 (weight decay)	1.0e-8
Char_max_len	20
Convolution kernel size	3×3
Number of convolution kernels	30
Max sentence length	300
Parameters for Experiments
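The experimental parameters above can be collected into a single configuration dictionary, as one might do when reimplementing the model. This is a sketch; the key names are illustrative, since the paper does not publish code.

```python
# Hyperparameters from the "Parameters for Experiments" table,
# gathered into one config dict (key names are illustrative).
config = {
    "lstm_layers": 2,
    "hidden_units": 100,
    "learning_rate": 0.015,
    "dropout": 0.5,
    "loss": "cross_entropy",
    "batch_size": 10,
    "optimizer": "Adam",
    "l2_weight_decay": 1.0e-8,
    "char_max_len": 20,
    "conv_kernel_size": (3, 3),
    "num_conv_kernels": 30,
    "max_sentence_length": 300,
}
print(config["learning_rate"])
# → 0.015
```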
Model (Features)	Precision	Recall	F1
BiLSTM-CNN-CRF(Baseline) 25.57% 8.17% 12.38%
BiLSTM-CNN-CRF(POS) 30.43% 15.26% 20.33%
BiLSTM-CNN-CRF(POS+REF) 60.42% 49.15% 54.21%
BiLSTM-CNN-CRF(POS+DIS) 61.18% 51.07% 55.67%
BiLSTM-CNN-CRF(REF+DIS) 61.71% 56.02% 58.73%
BiLSTM-CNN-CRF(POS+REF+DIS) 62.96% 57.63% 60.18%
BERT 52.13% 51.55% 51.94%
Results with Different Features on Test Dataset
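The F1 column in the table above is the harmonic mean of precision and recall, which can be verified directly, e.g. for the best-performing configuration (POS+REF+DIS):

```python
# F1 is the harmonic mean of precision and recall:
# F1 = 2 * P * R / (P + R). Checking the table's best row.

def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(round(f1(62.96, 57.63), 2))
# → 60.18, matching the table's 60.18% for BiLSTM-CNN-CRF(POS+REF+DIS)
```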
Prediction Model	Predicted Results
BiLSTM-CNN-CRF(POS+REF+DIS) We have adopted the Conditional Maximum Entropy (MaxEnt) modeling paradigm as outlined in REF3 and REF19
To quickly (and approximately) evaluate this phenomenon, we trained the statistical IBM word-alignment model 4 REF7, using the GIZA ++ software REF11 for the following language pairs: Chinese-English, Italian-English, and Dutch-English, using the IWSLT-2006 corpus REF23 for the first two language pairs, and the Europarl corpus REF9 for the last one.
In computational linguistic literature, much effort has been devoted to phonetic transliteration, such as English-Arabic, English-Chinese REF5, English-Japanese REF6 and English-Korean.
Tokenisation, species word identification and chunking were implemented in-house using the LTXML2 tools REF4, whilst abbreviation extraction used the Schwartz and Hearst abbreviation extractor REF9 and lemmatisation used morpha REF12.
Examples of Differences Between Labeled Results and Predicted Results
Related Articles:
[1] Chen Jie, Ma Jing, Li Xiaofeng. Short-Text Classification Method with Text Features from Pre-trained Models[J]. Data Analysis and Knowledge Discovery, 2021, 5(9): 21-30.
[2] Meng Zhen, Wang Hao, Yu Wei, Deng Sanhong, Zhang Baolong. Vocal Music Classification Based on Multi-category Feature Fusion[J]. Data Analysis and Knowledge Discovery, 2021, 5(5): 59-70.
[3] Zhang Guobiao, Li Jie. Detecting Social Media Fake News with Semantic Consistency Between Multi-model Contents[J]. Data Analysis and Knowledge Discovery, 2021, 5(5): 21-29.
[4] Wang Yuzhu, Xie Jun, Chen Bo, Xu Xinying. Multi-modal Sentiment Analysis Based on Cross-modal Context-aware Attention[J]. Data Analysis and Knowledge Discovery, 2021, 5(4): 49-59.
[5] Lin Kerou, Wang Hao, Gong Lijuan, Zhang Baolong. Disambiguation of Chinese Author Names with Multiple Features[J]. Data Analysis and Knowledge Discovery, 2021, 5(4): 90-102.
[6] Han Pu, Zhang Wei, Zhang Zhanpeng, Wang Yuxin, Fang Haoyu. Sentiment Analysis of Weibo Posts on Public Health Emergency with Feature Fusion and Multi-Channel[J]. Data Analysis and Knowledge Discovery, 2021, 5(11): 68-79.
[7] Li Junlian, Wu Yingjie, Deng Panpan, Leng Fuhai. Automatic Data Processing Strategy of Citation Anomie Based on Feature Fusion[J]. Data Analysis and Knowledge Discovery, 2020, 4(5): 38-45.
[8] Qi Ruihua, Jian Yue, Guo Xu, Guan Jinghua, Yang Mingxin. Sentiment Analysis of Cross-Domain Product Reviews Based on Feature Fusion and Attention Mechanism[J]. Data Analysis and Knowledge Discovery, 2020, 4(12): 85-94.
[9] Yu Chuanming, Gong Yutian, Zhao Xiaoli, An Lu. Collaboration Recommendation of Finance Research Based on Multi-feature Fusion[J]. Data Analysis and Knowledge Discovery, 2017, 1(8): 39-47.
  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938