System Analysis and Design for Methodological Entities Extraction in Full Text of Academic Literature
Hao Xu1, Xuefang Zhu2, Chengzhi Zhang3, Chuan Jiang4
1 School of Economics & Management, Nanjing Institute of Technology, Nanjing 211167, China
2 School of Information Management, Nanjing University, Nanjing 210023, China
3 School of Economics & Management, Nanjing University of Science and Technology, Nanjing 210094, China
4 College of Information Science & Technology, Nanjing Agricultural University, Nanjing 210095, China
Abstract [Objective] This paper proposes a system for extracting methodological entities from the full text of academic literature, with the aim of identifying their indexing features and usage. [Methods] First, we extracted feature sentences and methodological entities using dictionaries, rules, and manual annotation. Then, we implemented a methodological knowledge extraction module using Microsoft Visual Studio 2012 and SQL Server 2012. [Results] The precision of methodological feature extraction was 76%, and the recall was greater than 42%. Each feature sentence contained 1.42 methodological entities on average. The formal indexing ratio was less than 27% for methodological entities and less than 35% for feature sentences. Subject-specific methodological entities also had a low formal indexing rate. [Limitations] The system's recall and precision were not fully satisfactory. Entity extraction required intensive manual work, and semantic features were not included. [Conclusions] The proposed method is applicable across disciplines and helps explore the dissemination routes of interdisciplinary knowledge.
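The dictionary-and-rule step described under [Methods] can be illustrated with a minimal Python sketch. This is not the authors' implementation (the abstract reports a module built with Microsoft Visual Studio 2012 and SQL Server 2012); the cue words, the dictionary entries, and the function names `extract` and `precision_recall` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical cue phrases that mark a "feature sentence", i.e. a sentence
# describing how a method is used. These are assumptions, not the paper's rules.
CUE_PATTERN = re.compile(
    r"\b(using|based on|by means of|we adopt(?:ed)?|we appl(?:y|ied))\b",
    re.IGNORECASE,
)

# Hypothetical dictionary of methodological entities (assumed entries).
METHOD_DICTIONARY = [
    "CiteSpace",
    "SPSS",
    "content analysis",
    "co-word analysis",
    "support vector machine",
]


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract(text):
    """Return (feature_sentence, entities) pairs.

    A sentence is kept only if it contains a cue phrase (rule) and at least
    one dictionary entry (dictionary lookup, case-insensitive substring match).
    """
    results = []
    for sentence in split_sentences(text):
        if not CUE_PATTERN.search(sentence):
            continue
        lowered = sentence.lower()
        entities = [m for m in METHOD_DICTIONARY if m.lower() in lowered]
        if entities:
            results.append((sentence, entities))
    return results


def precision_recall(predicted, gold):
    """Set-based precision and recall of predicted items against a gold annotation."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

In this sketch, manual annotation would supply the `gold` set against which extracted entities are scored. Averaging the number of entities over the returned feature sentences gives a figure comparable in kind to the reported 1.42 entities per feature sentence.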
Received: 15 January 2019
Published: 25 November 2019
Corresponding Author:
Xuefang Zhu
E-mail: xfzhu@nju.edu.cn