Identifying Actual Value of Numerical Indicator from Scientific Paper
Guo Shaoqing1,2, Le Xiaoqiu1
1 (National Science Library, Chinese Academy of Sciences, Beijing 100190, China)
2 (University of Chinese Academy of Sciences, Beijing 100049, China)
Abstract [Objective] This paper aims to identify the actual values of numerical indicators in scientific literature. [Methods] First, we analyzed the shortest path tree between the indicator and the numerical entities. Second, we used distant supervision to learn the syntactic and descriptive characteristics of sentences containing numerical indicators. Third, we created four types of relationship templates: “more than”, “less than”, “equal” and “times”. Finally, we obtained the actual values of these indicators. [Results] We evaluated the proposed method in the fields of climate change and astronomy. The F-values were 82.35% and 77.55%, respectively, which were above the average of related studies. [Limitations] We did not investigate indicator values that span multiple sentences. [Conclusions] The proposed method can effectively extract the actual values of numerical indicators.
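As a rough illustration of the four relationship templates named in the abstract, the sketch below maps surface cue phrases to a relation type and extracts the accompanying number. It is not the authors' implementation: the abstract describes templates learned with dependency analysis and distant supervision, whereas this example substitutes hand-written regular expressions. The cue phrases, the `TEMPLATES` list, and the `match_relation` function are all assumptions made for illustration.

```python
import re

# Hypothetical cue phrases for the four relation types named in the abstract.
# These patterns are illustrative assumptions, not the templates from the paper.
NUMBER = r"(?P<value>\d+(?:\.\d+)?)"

# Order matters: the more specific relations are tried before "equal".
TEMPLATES = [
    ("times", re.compile(NUMBER + r"\s*(?:times|-fold)\b", re.I)),
    ("more than", re.compile(
        r"\b(?:more than|greater than|above|exceed(?:s|ed)?)\s+" + NUMBER, re.I)),
    ("less than", re.compile(
        r"\b(?:less than|lower than|below|under)\s+" + NUMBER, re.I)),
    ("equal", re.compile(
        r"\b(?:is|was|of|equals?|reached)\s+" + NUMBER, re.I)),
]


def match_relation(sentence):
    """Return (relation, value) for the first matching template, else None."""
    for relation, pattern in TEMPLATES:
        m = pattern.search(sentence)
        if m:
            return relation, float(m.group("value"))
    return None


if __name__ == "__main__":
    for s in [
        "The sea level rise was more than 3.2 mm per year.",
        "The mass is 2.5 times that of the Sun.",
        "The mean temperature was 14.6 degrees.",
    ]:
        print(s, "->", match_relation(s))
```

A regex sketch like this only handles a number that sits directly next to its cue phrase. The syntactic path analysis described in the abstract addresses the harder case where the indicator and the number are separated by other words in the sentence.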
Received: 05 November 2017
Published: 05 February 2018