Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (1): 21-28    DOI: 10.11925/infotech.2096-3467.2017.1091
Identifying Actual Value of Numerical Indicator from Scientific Paper
Guo Shaoqing1,2, Le Xiaoqiu1
1 (National Science Library, Chinese Academy of Sciences, Beijing 100190, China)
2 (University of Chinese Academy of Sciences, Beijing 100049, China)
[Objective] This paper aims to identify the actual values of numerical indicators in scientific literature. [Methods] First, we analyzed the Shortest-Path-Tree between the indicator and the digital entities. Second, we used distant supervision to learn the syntactic and descriptive characteristics of sentences containing numerical indicators. Third, we created four types of relationship templates: “more than”, “less than”, “equal” and “times”. Finally, we obtained the actual values of these indicators. [Results] We examined the proposed method in the fields of climate change and astronomy. The F-values were 82.35% and 77.55%, respectively, which were above the average of related studies. [Limitations] We did not investigate indicator values expressed across multiple sentences. [Conclusions] The proposed method can effectively obtain the actual values of numerical indicators.

Key words: Numerical Indicator; Actual Value; Template Recognition; Distant Supervision
Received: 05 November 2017      Published: 05 February 2018
ZTFLH:  G250.76  

Guo Shaoqing, Le Xiaoqiu. Identifying Actual Value of Numerical Indicator from Scientific Paper. Data Analysis and Knowledge Discovery, 2018, 2(1): 21-28.

Equal relation       …annual precipitation measured in this study is 734mm…
More-than relation   …temperature has risen by about 5 ℃ above yesterday…
Less-than relation   …CO2 concentration is 5% lower than the PM10 concentration…
Times relation       …capacity of this bottle is 2/3 of the other one…

JJR (POS tag)   RBR (POS tag)   as…as (phrase)   Of NN (POS tag + phrase)
Above           Over            Below            Under
Twice           Thrice          Half             More
Before          Behind          Ahead            ……
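
The cue words and example sentences above suggest how a sentence might be assigned to one of the four relationship types. The following is only a minimal sketch, not the authors' procedure: the function name `relation_type`, the grouping of cue words by relation, and the small lists of comparative words (used here in place of JJR/RBR part-of-speech tags, since no tagger is involved) are all assumptions made for illustration. "Before", "Behind" and "Ahead" are left out because the source does not say which relation they signal.

```python
import re

# Assumed grouping of cue words; the source lists the words but not their relation.
TIMES_CUES = {"twice", "thrice", "half"}
MORE_CUES = {"above", "over", "more", "higher", "greater"}   # last two stand in for JJR/RBR
LESS_CUES = {"below", "under", "less", "lower", "fewer"}     # last three stand in for JJR/RBR

# A number or fraction (optionally a percentage) followed by "of", e.g. "2/3 of".
NUMBER_OF = re.compile(r"\d+(?:[./]\d+)?\s*%?\s+of\b")


def relation_type(sentence):
    """Guess the relation type of a sentence from surface cues only."""
    text = sentence.lower()
    words = set(re.findall(r"[a-z]+", text))
    if words & TIMES_CUES or NUMBER_OF.search(text):
        return "times"
    if words & MORE_CUES:
        return "more than"
    if words & LESS_CUES:
        return "less than"
    # No comparison cue found: treat the number as the indicator's own value.
    return "equal"


if __name__ == "__main__":
    for s in (
        "annual precipitation measured in this study is 734mm",
        "temperature has risen by about 5 ℃ above yesterday",
        "CO2 concentration is 5% lower than the PM10 concentration",
        "capacity of this bottle is 2/3 of the other one",
    ):
        print(relation_type(s), "<-", s)
```

A surface-level check like this ignores syntax entirely; the method summarized in the abstract relies on syntactic paths, so the sketch should be read as a rough first filter at most.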
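
Type                  Value relation                         Conversion relation
More-than type        Multiplier units such as %, times      Baseline entity × ( 1 + value unit )
                      Other units                            ( Baseline entity + value ) unit
Less-than type        Multiplier units such as %, times      Baseline entity × ( 1 - value unit )
                      Other units                            ( Baseline entity - value ) unit
Times/fraction type   All units                              Baseline entity × value [%]
Equal type            All units                              Value unit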
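
The conversion rules in this table can be written as a small function. This is a sketch under stated assumptions rather than the authors' implementation: the name `actual_value`, its parameters, and the relation labels (borrowed from the abstract's “more than”, “less than”, “equal” and “times”) are hypothetical, and for non-multiplier units the baseline is assumed to be expressed in the same unit as the extracted value.

```python
MULTIPLIER_UNITS = {"%", "times"}  # units that scale the baseline instead of shifting it


def _as_factor(value, unit):
    """Turn a value with a multiplier unit into a plain factor (5 % -> 0.05)."""
    return value / 100 if unit == "%" else value


def actual_value(relation, value, unit, baseline=None, baseline_unit=None):
    """Return (number, unit) for an indicator, following the conversion table.

    relation: "equal", "more than", "less than" or "times" (assumed labels).
    baseline: numeric value of the baseline entity; not needed for "equal".
    """
    if relation == "equal":
        return value, unit                                   # Value unit
    if baseline is None:
        raise ValueError("relation %r needs a baseline value" % relation)
    if relation == "times":
        # Baseline entity × value [%]
        return baseline * _as_factor(value, unit), baseline_unit
    if relation not in ("more than", "less than"):
        raise ValueError("unknown relation %r" % relation)
    sign = 1 if relation == "more than" else -1
    if unit in MULTIPLIER_UNITS:
        # Baseline entity × (1 ± value unit)
        return baseline * (1 + sign * _as_factor(value, unit)), baseline_unit
    # (Baseline entity ± value) unit
    return baseline + sign * value, unit


if __name__ == "__main__":
    print(actual_value("equal", 734, "mm"))
    print(actual_value("more than", 5, "℃", baseline=20))
    print(actual_value("less than", 5, "%", baseline=200, baseline_unit="ppm"))
    print(actual_value("times", 2 / 3, "", baseline=300, baseline_unit="ml"))
```

The baseline numbers in the usage lines (20, 200, 300) are invented for the demonstration; in practice the baseline entity's value would itself have to be resolved first.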

Indicator              Unit      Indicator       Unit
Mass median diameter   mm        Survival rate   %
Vehicle speed          km·h⁻¹    Total weight    kg
Scattering angle       °         ……

Template                  Frequency   Support   Template                  Frequency   Support
NN|NP|PP[of]|NP|CD        1591        65.61%    NN|NP|VP[be]|PP|NP|CD     766         53.75%
NN|NP|PP[between]|NP|CD   228         9.41%     NN|NP|VP|PP[from]|NP|CD   510         35.79%
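
Frequency and support figures of this kind can be tallied with a simple counter. The sketch below assumes that support is the share of observed paths in a group that match a given template; the source does not define the term, so this reading, the function name `template_support`, and the toy input are all illustrative.

```python
from collections import Counter


def template_support(paths):
    """Map each path template to (frequency, support), most frequent first.

    paths: one template string per indicator sentence, e.g. "NN|NP|PP[of]|NP|CD".
    Support is assumed to be frequency divided by the number of paths.
    """
    counts = Counter(paths)
    total = len(paths)
    return {tpl: (n, n / total) for tpl, n in counts.most_common()}


if __name__ == "__main__":
    # Toy input, not data from the paper.
    toy_paths = ["NN|NP|PP[of]|NP|CD"] * 3 + ["NN|NP|PP[between]|NP|CD"]
    for tpl, (freq, support) in template_support(toy_paths).items():
        print(f"{tpl}\t{freq}\t{support:.2%}")
```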

Procedure                                 Precision   Recall   F-value
(1) Original identification procedure     78.15%      81.21%   79.11%
(2) (1) plus clause judgment              85.31%      75.62%   80.18%
(3) (1) and (2) plus common templates     84.01%      80.76%   82.35%
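
For readers who want to relate the three columns, the sketch below assumes the F-value is the balanced F1 score, that is, the harmonic mean of precision and recall. The table does not state which F-measure was used, so this is an assumption, and `f_value` is a hypothetical helper name.

```python
def f_value(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Both arguments must be on the same scale (both percentages or both fractions).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall of procedure (3), in percent.
    print(round(f_value(84.01, 80.76), 2))
```

With the precision and recall of procedure (3), this gives about 82.35, in line with the F-value reported for that row.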
