Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (11): 1-12    DOI: 10.11925/infotech.2096-3467.2022.0034
Abstracting Biomedical Documents with Knowledge Enhancement
Deng Lu, Hu Po, Li Xuanhong
School of Computer Science, Central China Normal University, Wuhan 430079, China
[Objective] This study proposes a new text summarization model for biomedical literature, aiming to improve the quality of automatically generated abstracts. [Methods] First, we extracted the important content of biomedical texts with extractive summarization techniques. Second, we matched this content against a related knowledge base to extract key terms and their corresponding concepts. Third, we integrated the content and concepts into the neural network summarization model as background knowledge for the attention mechanism. With the help of domain knowledge, the proposed model can focus on the important information in the text while reducing the noise introduced by external information. [Results] We evaluated the proposed model, PG-meta, on three biomedical datasets. Its average ROUGE score reached 31.06, which is 1.51 higher than that of the original PG model. [Limitations] We did not investigate how different knowledge acquisition methods affect the effectiveness of the model. [Conclusions] The proposed model can better capture the deeper meaning of biomedical documents and improve the quality of the generated abstracts.

Key words: Biomedical Text Mining; Generative Abstract; Domain Knowledge; Knowledge Enhancement
Received: 12 January 2022      Published: 13 January 2023
CLC Number: TP393
Fund: Research Project of the State Language Commission (YB135-149); Fundamental Research Funds for the Central Universities (CCNU20ZT012)
Deng Lu, Hu Po, Li Xuanhong. Abstracting Biomedical Documents with Knowledge Enhancement. Data Analysis and Knowledge Discovery, 2022, 6(11): 1-12.

Structure of the PG-meta Model
Examples of Knowledge Acquisition
Category    Content
Glaucoma is a leading cause of blindness within the United States and the leading cause of blindness among African-Americans. Measurement of intraocular pressure only is no longer considered adequate for screening. Recognition of risk factors and examination of the optic nerve are key strategies to identify individuals at risk. Medical and surgical treatment of glaucoma have ······
control of iop regulation resides within the aqueous outflow system of the eye ( grant , 1958 ) and iop regulation becomes abnormal in glaucoma.<q>iop is the only treatable risk factor.<q>the intrinsic outflow system abnormality in glaucoma is unknown but is described as poag······
Glaucoma:Eye disease
IOP:Intraocular pressure
POAG:Glaucoma, Primary Open Angle
Examples of the Relationship Between Knowledge Acquisition and Corresponding Abstracts
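The term-to-concept step illustrated in the table above can be sketched as a simple dictionary lookup. This is only a minimal illustration under stated assumptions: the dictionary `TERM_TO_CONCEPT` is a hypothetical stand-in for the knowledge base (it contains just the three pairs shown in the table), the function name `extract_terms_and_concepts` is invented for this example, and the input string is an abridged version of the extracted content. The matching procedure actually used by the authors is not described in this section.

```python
import re

# Hypothetical stand-in for a biomedical knowledge base: the three
# term -> concept pairs are copied from the example table above.
TERM_TO_CONCEPT = {
    "glaucoma": "Eye disease",
    "iop": "Intraocular pressure",
    "poag": "Glaucoma, Primary Open Angle",
}


def extract_terms_and_concepts(text, knowledge=TERM_TO_CONCEPT):
    """Return (term, concept) pairs for known terms, in order of first appearance."""
    pairs = []
    seen = set()
    # Lowercase and split on non-alphanumeric characters so that
    # punctuation and sentence markers such as <q> do not stick to terms.
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        if token in knowledge and token not in seen:
            seen.add(token)
            pairs.append((token, knowledge[token]))
    return pairs


# Abridged version of the extracted important content shown above.
important_content = (
    "iop regulation becomes abnormal in glaucoma.<q>"
    "iop is the only treatable risk factor.<q>"
    "the outflow system abnormality is described as poag"
)

for term, concept in extract_terms_and_concepts(important_content):
    print(f"{term}: {concept}")
```

A real system would need multi-word term matching and disambiguation, which this single-token lookup does not attempt.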
Dataset   Subset
Full-Abs  Training set    3,200   25,825  1,072  #    10
          Validation set  400     24,348  903    #    9
          Test set        400     24,484  936    #    9
Abs-Ti    Training set    3,514   1,477   111    9    3
          Validation set  439     1,439   109    9    3
          Test set        439     1,466   113    9    3
BioAbsTi  Training set    24,631  1,574   118    109  4
          Validation set  8,210   1,584   118    109  4
          Test set        8,210   1,562   117    10   4
Statistical Results of the Experimental Datasets
Experimental Results Under Different Values of d
Model            Full-Abs               Abs-Ti                 BioAbsTi               AVG
                 R-1    R-2    R-L      R-1    R-2    R-L      R-1    R-2    R-L
Lead             27.93  15.38  24.58    23.36  13.79  24.58    31.79  16.55  28.67    22.96
TextRank         27.82  14.64  24.83    24.83  13.56  20.93    33.52  17.29  30.23    23.07
PG               35.95  20.42  29.87    33.25  18.36  31.45    36.58  25.56  34.52    29.55
Keywords-PG      36.23  20.89  30.52    33.75  19.06  31.83    36.93  26.22  34.85    30.03
PG-meta(All)     36.15  20.85  30.28    33.54  18.75  31.79    36.88  25.96  34.58    29.86
BERT+Clustering  32.85  18.87  28.85    27.53  14.22  23.93    35.25  21.52  32.85    26.20
BERTSum          33.93  19.86  29.18    28.70  14.56  24.34    37.60  22.81  33.65    27.18
PG-meta          37.05  21.96  33.58    34.82  20.21  32.97    37.58  26.26  35.19    31.06
Experimental Results of the Baseline Model and the PG-meta Model Involved in the Comparison on the Three Datasets
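As a quick consistency check on the table above, the AVG column can be compared with the unweighted mean of the nine ROUGE scores in each row. That AVG is computed this way is an assumption made here, not something the section states; the sketch below checks it for the PG and PG-meta rows only, and the names `SCORES`, `REPORTED_AVG`, and `mean` are chosen for this example.

```python
# ROUGE scores copied from the comparison table above, in the order
# Full-Abs (R-1, R-2, R-L), Abs-Ti (R-1, R-2, R-L), BioAbsTi (R-1, R-2, R-L).
SCORES = {
    "PG": [35.95, 20.42, 29.87, 33.25, 18.36, 31.45, 36.58, 25.56, 34.52],
    "PG-meta": [37.05, 21.96, 33.58, 34.82, 20.21, 32.97, 37.58, 26.26, 35.19],
}

# AVG values as reported in the table.
REPORTED_AVG = {"PG": 29.55, "PG-meta": 31.06}


def mean(values):
    """Unweighted arithmetic mean (assumed definition of the AVG column)."""
    return sum(values) / len(values)


for model, scores in SCORES.items():
    computed = mean(scores)
    print(f"{model}: computed {computed:.4f}, reported {REPORTED_AVG[model]:.2f}")

# Improvement quoted in the abstract, taken from the reported averages.
gain = REPORTED_AVG["PG-meta"] - REPORTED_AVG["PG"]
print(f"gain of PG-meta over PG: {gain:.2f}")
```

Under this assumption the computed means agree with the reported values to within 0.01, and the difference between the two reported averages matches the 1.51 gain quoted in the abstract.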
Category    Text content
Source text    ······. A recent study reported that cardiac lymphatic endothelial cells (LECs) stem from venous and non-venous origins in mice. Here, we identified Isl1-expressing progenitors as a potential non-venous origin of cardiac LECs. Genetic lineage tracing with Isl1-Cre reporter mice suggested a possible contribution from the Isl1-expressing pharyngeal mesoderm constituting the second heart field to lymphatic vessels around the cardiac outflow tract as well as to those in the facial skin and the lymph sac. Isl1(+) lineage-specific deletion of Prox1 resulted in disrupted LYVE1(+) vessel structures, indicating a Prox1-dependent mechanism in this contribution. ······
Reference summary    Isl1-expressing non-venous cell lineage contributes to cardiac lymphatic vessel development.
Here, we identified Isl1-expressing progenitors as a potential non-venous origin of cardiac LECs.
Summary from the PG model    The non-venous cell lineage can help the development of cardiac lymphatic vessels.
The non-venous cell lineage of Isl1-expressing promotes the development of cardiac lymphatic vessels.
Summary Results Automatically Generated by the Three Models
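To make the ROUGE-N metric used in the result tables concrete, the following is a minimal sketch of an n-gram overlap F1 score, applied to the reference summary and the PG model's summary from the example above. It is a simplified illustration, not the official ROUGE toolkit: it assumes lowercasing and alphanumeric tokenization, applies no stemming or stopword handling, and returns a value in [0, 1] rather than the 0 to 100 scale of the tables. The function names `ngrams` and `rouge_n_f1` are chosen for this example.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1: clipped n-gram overlap between two texts."""
    ref = ngrams(re.findall(r"[a-z0-9]+", reference.lower()), n)
    cand = ngrams(re.findall(r"[a-z0-9]+", candidate.lower()), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = ("Isl1-expressing non-venous cell lineage contributes to "
             "cardiac lymphatic vessel development.")
pg_summary = ("The non-venous cell lineage can help the development of "
              "cardiac lymphatic vessels.")

print(f"ROUGE-1 F1: {rouge_n_f1(reference, pg_summary, n=1):.3f}")
print(f"ROUGE-2 F1: {rouge_n_f1(reference, pg_summary, n=2):.3f}")
```

Because no stemming is applied, "vessel" and "vessels" count as different tokens here, so scores from this sketch will not match those of a standard ROUGE implementation.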
Dataset   Metric  PG     PG-meta  PG-meta(TR)  PG-meta(BS)
Full-Abs  R-1     35.95  36.89    36.97        37.05
          R-2     20.42  21.88    21.97        21.96
          R-L     29.87  32.95    33.56        33.58
Abs-Ti    R-1     33.25  34.67    34.85        34.82
          R-2     18.36  19.66    19.35        20.21
          R-L     31.45  32.58    32.73        32.97
BioAbsTi  R-1     36.58  37.42    37.55        37.58
          R-2     25.56  25.93    26.13        26.26
          R-L     34.52  34.97    35.16        35.19
AVG               29.55  30.77    30.91        31.06
Experimental Results Under Different Important Content Extraction Methods
Dataset   Metric  PG-meta(term)  PG-meta(con)  PG-meta(t-c)
Full-Abs  R-1     36.53          36.75         37.05
          R-2     21.85          21.72         21.96
          R-L     33.24          33.35         33.58
Abs-Ti    R-1     34.63          34.79         34.82
          R-2     20.19          20.25         20.21
          R-L     32.73          32.79         32.97
BioAbsTi  R-1     37.46          37.59         37.58
          R-2     26.07          26.12         26.26
          R-L     35.07          35.03         35.19
AVG               30.86          30.93         31.06
Experimental Results Under Different Knowledge Correlation Granularities
Experimental Results of Different Knowledge Fusion Methods
