Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (1): 134-144    DOI: 10.11925/infotech.2096-3467.2021.0612
Disease Knowledge Discovery Based on SPO Predications
Cai Miaozhi, Li Xiaoying, Zhao Jiawei, Feng Fengxiang, Ren Huiling
Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100020, China

[Objective] This study aims to discover knowledge from high-level evidence-based literature on diseases indexed in PubMed, providing a reference for clinical diagnosis and treatment as well as routine disease prevention and control. [Methods] We proposed a disease knowledge discovery model based on SPO (subject-predicate-object) predications extracted with the semantic extraction tool SemRep. We then evaluated the model on diabetes-related literature and discovered knowledge by combining SPO visualization with clinical knowledge. [Results] We obtained 1,258 SPO predications covering 16 semantic relationships, which identified diabetes-related genes, common complications, and detection and treatment methods. [Limitations] We only examined the model with publicly accessible literature. More research is needed to include knowledge bases and electronic medical records. [Conclusions] The disease knowledge discovery model based on SPO predications can identify biomedical knowledge from the literature and provide potential research hypotheses and ideas for biomedical researchers.

Key words: SPO; Diabetes Mellitus; Knowledge Discovery; Knowledge Organization
Received: 21 June 2021      Published: 22 February 2022
ZTFLH:  G250  
Fund: National Key Research and Development Program of China (2019AAA0104901); National Social Science Fund of China (20BTQ062); China-WHO Biennial Collaborative Projects (GJ2-2021-WHOSO-01)
Corresponding Author: Ren Huiling, ORCID: 0000-0002-1067-408X

Cite this article:

Cai Miaozhi, Li Xiaoying, Zhao Jiawei, Feng Fengxiang, Ren Huiling. Disease Knowledge Discovery Based on SPO Predications. Data Analysis and Knowledge Discovery, 2022, 6(1): 134-144.


Figure: Disease Knowledge Discovery Model
Figure: SemRep Output Example
Type | Semantic relationship | Semantic pattern example | Triple example
Diagnosis and treatment | TREATS | phsu-TREATS-dsyn | Metformin-TREATS-Diabetes Mellitus, Non-Insulin-Dependent
Diagnosis and treatment | TREATS | topp-TREATS-dsyn | Interventional procedure-TREATS-Diabetes Mellitus, Non-Insulin-
Diagnosis and treatment | TREATS | horm-TREATS-dsyn | Insulin-TREATS-Diabetes Mellitus, Non-Insulin-Dependent
Diagnosis and treatment | DIAGNOSES | diap-DIAGNOSES-dsyn | Oral Glucose Tolerance Test-DIAGNOSES-Diabetes
Diagnosis and treatment | DIAGNOSES | lbpr-DIAGNOSES-dsyn | Glucose tolerance test-DIAGNOSES-Gestational Diabetes
Diagnosis and treatment | PREVENTS | dora-PREVENTS-dsyn | Exercise-PREVENTS-Gestational Diabetes
Diagnosis and treatment | PREVENTS | phsu-PREVENTS-dsyn | Metformin-PREVENTS-Diabetes
Related diseases | PRECEDES | dsyn-PRECEDES-dsyn | Myocardial Infarction-PRECEDES-Diabetes
Related diseases | COEXISTS_WITH | dsyn-COEXISTS_WITH-dsyn | Hypoglycemia-COEXISTS_WITH-Diabetes Mellitus, Insulin-Dependent
Related diseases | COEXISTS_WITH | patf-COEXISTS_WITH-dsyn | Insulin Resistance-COEXISTS_WITH-Diabetes Mellitus, Non-Insulin-Dependent
Disease characteristics | LOCATION_OF | bpoc-LOCATION_OF-dsyn | Eye-LOCATION_OF-Diabetic macular edema
Disease characteristics | ISA | dsyn-ISA-dsyn | Diabetes Mellitus, Non-Insulin-Dependent-ISA-Metabolic Diseases
Influencing/associated factors | CAUSES | dsyn-CAUSES-dsyn | Diabetic Nephropathy-CAUSES-Kidney Failure, Chronic
Influencing/associated factors | CAUSES | patf-CAUSES-dsyn | Insulin Resistance-CAUSES-Diabetes Mellitus, Non-Insulin-Dependent
Influencing/associated factors | AFFECTS | orch-AFFECTS-dsyn | Blood Glucose-AFFECTS-Diabetes Mellitus, Insulin-Dependent
Influencing/associated factors | PREDISPOSES | dsyn-PREDISPOSES-dsyn | Diabetes Mellitus, Non-Insulin-Dependent-PREDISPOSES-Cardiovascular Diseases
Influencing/associated factors | ASSOCIATED_WITH | aapp-ASSOCIATED_WITH-dsyn | Insulin-ASSOCIATED_WITH-Diabetes Mellitus, Insulin-Dependent
Influencing/associated factors | ASSOCIATED_WITH | gngm-ASSOCIATED_WITH-dsyn | IMPACT gene-ASSOCIATED_WITH-Diabetes Mellitus, Non-Insulin-Dependent
Pharmacological effects | AUGMENTS | aapp-AUGMENTS-celf | Insulin-AUGMENTS-glucose uptake
Pharmacological effects | INHIBITS | phsu-INHIBITS-bacs | canagliflozin-INHIBITS-Glucose
Pharmacological effects | DISRUPTS | aapp-DISRUPTS-dsyn | ranibizumab-DISRUPTS-Diabetic macular edema
Pharmacological effects | INTERACTS_WITH | aapp-INTERACTS_WITH-orch | CD69 protein, human-INTERACTS_WITH-Blood Glucose
Semantic Relationships and Semantic Patterns of Diabetes Mellitus SPO Predications
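The triple examples above are written as `Subject-PREDICATE-Object` strings, where the subject and object may themselves contain hyphens (for example "Non-Insulin-Dependent"). The following is a minimal sketch of how such a string could be split back into its three parts, assuming the predicate is the only hyphen-delimited token written entirely in capitals and underscores. The `Predication` type and `parse_triple` function are hypothetical names introduced for illustration; they are not part of SemRep or of the authors' pipeline.

```python
import re
from typing import NamedTuple, Optional


class Predication(NamedTuple):
    """One subject-predicate-object (SPO) triple."""
    subject: str
    predicate: str
    obj: str


# Assumption: the predicate is an all-caps token (underscores allowed)
# surrounded by hyphens, e.g. "-TREATS-" or "-COEXISTS_WITH-".
_TRIPLE = re.compile(r"(?P<s>.+?)-(?P<p>[A-Z][A-Z_]+)-(?P<o>.+)")


def parse_triple(text: str) -> Optional[Predication]:
    """Split 'Subject-PREDICATE-Object' into its parts, or return None."""
    match = _TRIPLE.fullmatch(text.strip())
    if match is None:
        return None
    return Predication(match["s"], match["p"], match["o"])


if __name__ == "__main__":
    examples = [
        "Metformin-TREATS-Diabetes Mellitus, Non-Insulin-Dependent",
        "Diabetes Mellitus, Non-Insulin-Dependent-PREDISPOSES-Cardiovascular Diseases",
        "phsu-TREATS-dsyn",
    ]
    for line in examples:
        print(parse_triple(line))
```

The same pattern also splits the semantic pattern strings such as `phsu-TREATS-dsyn`. It would mis-split a subject that happens to contain a hyphenated all-caps token, so a real pipeline would more safely read the separate fields of the extraction tool's output than re-parse display strings.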
Figure: SPO Visualization
Type | S | P | O | Frequency
Gene | SLC5A2 gene | ASSOCIATED_WITH | Diabetes Mellitus, Non-Insulin-Dependent | 5
Gene | HSD11B1 wt Allele | ASSOCIATED_WITH | Diabetes Mellitus, Non-Insulin-Dependent | 3
Gene | FABP4 gene | ASSOCIATED_WITH | Insulin Resistance | 3
Complication | Hypoglycemia | COEXISTS_WITH | Diabetes Mellitus, Insulin-Dependent | 40
Complication | Cardiovascular Diseases | COEXISTS_WITH | Diabetes Mellitus, Non-Insulin-Dependent | 20
Complication | Diabetic Nephropathy | ISA | Complication | 18
Complication | Diabetic Foot | ISA | Complication | 16
Complication | Diabetic Retinopathy | ISA | Complication | 12
Detection method | Body mass index procedure | DIAGNOSES | Diabetes Mellitus, Non-Insulin-Dependent | 13
Detection method | Oral Glucose Tolerance Test | DIAGNOSES | Diabetes | 13
Treatment | Metformin | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 338
Treatment | Insulin | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 202
Treatment | sitagliptin | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 130
Treatment | liraglutide | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 97
Treatment | dapagliflozin | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 67
Treatment | pioglitazone | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 66
Treatment | canagliflozin | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 61
Treatment | exenatide | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 59
Treatment | empagliflozin | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 52
Treatment | Exercise | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 101
Treatment | Exercise Training | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 28
Treatment | High-Intensity Interval Training | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 13
Treatment | Diet, Carbohydrate-Restricted | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 13
Treatment | Very low energy diet | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 13
Treatment | Diet, High-Protein | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 5
Treatment | Diet, Mediterranean | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 3
Treatment | diabetes education | ISA | Self-Management | 68
Treatment | Resistance education | TREATS | Diabetes Mellitus, Non-Insulin-Dependent | 26
Examples of Diabetes Mellitus SPO Predications
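Frequency tables like the one above can be summarized by grouping predications on their predicate and ranking subjects for a given predicate and object. The sketch below illustrates this with six rows copied from the table. The `ROWS` list and the `totals_by_predicate` and `top_subjects` helpers are hypothetical names chosen for illustration, and the totals cover only these sample rows, not the full set of predications reported in the study.

```python
from collections import Counter
from typing import List, Tuple

Row = Tuple[str, str, str, int]  # (subject, predicate, object, frequency)

T2D = "Diabetes Mellitus, Non-Insulin-Dependent"

# A small sample of rows taken from the table above (not the full data set).
ROWS: List[Row] = [
    ("SLC5A2 gene", "ASSOCIATED_WITH", T2D, 5),
    ("Hypoglycemia", "COEXISTS_WITH", "Diabetes Mellitus, Insulin-Dependent", 40),
    ("Cardiovascular Diseases", "COEXISTS_WITH", T2D, 20),
    ("Metformin", "TREATS", T2D, 338),
    ("Insulin", "TREATS", T2D, 202),
    ("sitagliptin", "TREATS", T2D, 130),
]


def totals_by_predicate(rows: List[Row]) -> Counter:
    """Sum predication frequencies per semantic relationship."""
    totals: Counter = Counter()
    for _subject, predicate, _obj, freq in rows:
        totals[predicate] += freq
    return totals


def top_subjects(rows: List[Row], predicate: str, obj: str, n: int = 3) -> List[Tuple[str, int]]:
    """Rank subjects by frequency for one predicate-object pair."""
    counts: Counter = Counter()
    for subject, pred, target, freq in rows:
        if pred == predicate and target == obj:
            counts[subject] += freq
    return counts.most_common(n)


if __name__ == "__main__":
    print(totals_by_predicate(ROWS))
    print(top_subjects(ROWS, "TREATS", T2D, n=2))
```

Keeping the frequency as an explicit column, rather than repeating each triple, mirrors the table layout and makes it straightforward to load the rows from a delimited file instead of the inline list.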