Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (5): 115-126    DOI: 10.11925/infotech.2096-3467.2020.1263
Optimizing Automatic Question Answering System Based on Disease Knowledge Graph
Li He, Liu Jiayu, Li Shiyu, Wu Di, Jin Shuaiqi
School of Management, Jilin University, Changchun 130022, China
Abstract  

[Objective] This paper optimizes an existing question answering system, aiming to give the public a more accurate tool for querying disease knowledge. [Methods] Based on the disease knowledge graph, we identified disease and symptom entities using the Aho-Corasick (AC) algorithm and semantic similarity calculation. Then, we classified users’ questions using manual annotation and AC matching. Finally, we encapsulated the matched words into a dictionary, which was converted to a database query language to retrieve the answers to the questions. [Results] We evaluated the new system on a Chinese medical question answering dataset. It achieved an average accuracy of 86.0% across five types of questions on COVID-19, which is higher than that of the existing Q&A system. [Limitations] Many values are missing in the data on “checkup” and “infection”, which affects the performance of the new system. [Conclusions] The optimized automatic question answering system is an effective knowledge retrieval tool for epidemic-related diseases.
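
The abstract names the Aho-Corasick (AC) algorithm as the matcher for both entity recognition and question classification. The sketch below is a minimal, generic AC automaton in plain Python, not the authors' implementation; the function names `build_automaton` and `find_all` and the sample patterns are assumptions made for illustration.

```python
from collections import deque


def build_automaton(patterns):
    """Build an Aho-Corasick automaton: trie transitions, failure links, outputs."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # patterns recognised on reaching each state
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pattern)

    # Breadth-first pass: set failure links and inherit outputs from them.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, automaton):
    """Return (start_index, pattern) for every pattern occurrence in text."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in out[state]:
            hits.append((i - len(pattern) + 1, pattern))
    return hits


# Hypothetical dictionary mixing one disease alias and one question feature word.
automaton = build_automaton(["新冠肺炎", "有什么症状"])
print(find_all("新冠肺炎有什么症状", automaton))
```

A single pass over the question finds every dictionary term at once, which is why AC suits a setting where the entity dictionary and the feature-word lists are both large.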

Key words: Knowledge Graph; Q&A System; COVID-19; System Optimization; Aho-Corasick
Received: 21 December 2020      Published: 08 March 2021
CLC number (ZTFLH): G250
Fund: This work is supported by the National Natural Science Foundation of China (Grant No. 71974075) and the Graduate Innovation Fund of Jilin University (Grant No. 101832020CX060).
Corresponding Author: Liu Jiayu     E-mail: liujiayu0426@163.com

Cite this article:

Li He, Liu Jiayu, Li Shiyu, Wu Di, Jin Shuaiqi. Optimizing Automatic Question Answering System Based on Disease Knowledge Graph. Data Analysis and Knowledge Discovery, 2021, 5(5): 115-126.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.1263     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I5/115

The Framework of Disease Knowledge Graph
The Optimization Framework of QA System Based on Disease Knowledge Graph
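Disease | Alias | Infectivity | Information source | Medical department | Checkup items | Symptoms | Drugs | Cause
Novel coronavirus pneumonia (新型冠状病毒肺炎) | COVID-19; 新冠肺炎 | Infectious | Baidu Baike | Fever clinic; Infectious diseases department | Nucleic acid test | Chills with or without fever… | Lianhua Qingwen capsules (granules)… | Coronavirus infection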
Disease Data Sheet (Partial)
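Entity type | Meaning | Relation type | Meaning
Disease | Disease | - | -
Symptom | Symptom | HAS_SYMPTOM | Has symptom
Part | Affected body part | PART_IS | Body part is
Department | Medical department | DEPARTMENT_IS | Department is
Drug | Drug | HAS_DRUG | Has drug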
Entity and Relation of COVID-19 Knowledge Graph
Property type | Meaning
age | Affected population
infection | Infectivity
checklist | Checkup items
treatment | Treatment
cause | Cause
resource | Information source
alias | Alias
Property of COVID-19 Knowledge Graph
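
The abstract says the matched words are packed into a dictionary and converted to a database query language, without naming that language. The sketch below assumes Cypher (the schema's relation names resemble Neo4j conventions), a `name` property on every node, and relations pointing from `Disease` to the target entity; all three are assumptions, as are the category keys and the function name `build_query`. Only the relation and property identifiers come from the tables above.

```python
# Question category -> (relation type, target label); identifiers from the schema tables.
RELATION_QUESTIONS = {
    "department": ("DEPARTMENT_IS", "Department"),
    "symptom": ("HAS_SYMPTOM", "Symptom"),
}

# Question category -> property stored on the Disease node.
PROPERTY_QUESTIONS = {
    "cause": "cause",
    "infection": "infection",
    "checkup": "checklist",
}


def build_query(matched):
    """Turn {"disease": ..., "category": ...} into (query_text, parameters)."""
    category = matched["category"]
    params = {"name": matched["disease"]}
    if category in RELATION_QUESTIONS:
        relation, label = RELATION_QUESTIONS[category]
        # Assumed direction: (Disease)-[relation]->(target entity).
        query = (
            f"MATCH (d:Disease {{name: $name}})-[:{relation}]->(t:{label}) "
            "RETURN t.name AS answer"
        )
    elif category in PROPERTY_QUESTIONS:
        prop = PROPERTY_QUESTIONS[category]
        query = f"MATCH (d:Disease {{name: $name}}) RETURN d.{prop} AS answer"
    else:
        raise ValueError(f"unsupported question category: {category}")
    return query, params


query, params = build_query({"disease": "新型冠状病毒肺炎", "category": "symptom"})
print(query)
print(params)
```

Passing the disease name as a query parameter rather than formatting it into the string keeps user text out of the query itself; the relation and property names are interpolated only from the fixed tables.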
COVID-19 Knowledge Graph(Partial)
Word Cloud of Question Classification
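Question category | Category feature words
Department | 什么科 (which department), 科室 (department), 挂什么科 (which department to register with), 哪科 (which department), 什么科室 (what department)
Symptom | 症状是什么 (what are the symptoms), 有什么症状 (what symptoms are there), 哪些现象 (what phenomena), 表现是什么 (what are the manifestations), 哪些征兆 (what signs), 有什么表现 (what manifestations are there), 哪些症状 (which symptoms)
Cause | 发病原因是什么 (what is the cause of onset), 什么原因导致的 (what caused it), 原因是什么 (what is the cause), 什么原因 (what cause), 导致 (leads to), 病因是什么 (what is the etiology), 怎么造成 (how is it caused), 怎么引起 (how is it triggered)
Infectivity | 传染吗 (is it contagious), 是传染病吗 (is it an infectious disease), 会不会传染 (can it spread), 是否传染 (whether it is contagious), 会不会感染 (can one get infected)
Checkup | 什么检查 (what checkup), 怎么诊断 (how to diagnose), 检查什么 (what to check), 化验什么 (what lab tests), 怎么检查 (how to check), 怎么确诊 (how to confirm the diagnosis)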
Results of Manual Annotation
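
The feature words in the table above can be turned into a simple question classifier. The sketch below is only an illustration: it uses plain substring tests instead of the AC automaton the paper names, and the English category keys and the function name `classify` are assumptions. The feature words themselves are copied from the table.

```python
# Category -> feature words, copied from the manual annotation table.
FEATURE_WORDS = {
    "department": ["什么科", "科室", "挂什么科", "哪科", "什么科室"],
    "symptom": ["症状是什么", "有什么症状", "哪些现象", "表现是什么",
                "哪些征兆", "有什么表现", "哪些症状"],
    "cause": ["发病原因是什么", "什么原因导致的", "原因是什么", "什么原因",
              "导致", "病因是什么", "怎么造成", "怎么引起"],
    "infection": ["传染吗", "是传染病吗", "会不会传染", "是否传染", "会不会感染"],
    "checkup": ["什么检查", "怎么诊断", "检查什么", "化验什么", "怎么检查", "怎么确诊"],
}


def classify(question):
    """Return every category that has at least one feature word in the question."""
    return [
        category
        for category, words in FEATURE_WORDS.items()
        if any(word in question for word in words)
    ]


print(classify("新冠肺炎挂什么科"))
print(classify("新冠肺炎会不会传染"))
```

Returning a list rather than a single label leaves room for questions that ask about two things at once; how the original system resolves such cases is not stated in this section.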
Question category | Precision (%) | Recall (%) | F-score (%)
Department | 91.42 | 96.82 | 94.04
Symptom | 95.36 | 61.80 | 74.99
Cause | 98.21 | 93.22 | 95.65
Infectivity | 97.73 | 91.49 | 94.51
Checkup | 93.36 | 93.78 | 93.57
Results of Question Matching
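
The F values in the table above are consistent with the usual harmonic mean of the first two columns, F = 2PR / (P + R), which also suggests the first column is precision rather than overall accuracy. The short check below recomputes F from the reported figures; the English category keys and the function name `f_score` are assumptions, and recomputed values can differ from the table in the last digit because the inputs are already rounded.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) per question category, from the table.
MATCHING_RESULTS = {
    "department": (91.42, 96.82, 94.04),
    "symptom": (95.36, 61.80, 74.99),
    "cause": (98.21, 93.22, 95.65),
    "infection": (97.73, 91.49, 94.51),
    "checkup": (93.36, 93.78, 93.57),
}

for category, (p, r, reported) in MATCHING_RESULTS.items():
    print(f"{category}: recomputed {f_score(p, r):.2f}, reported {reported:.2f}")
```

The symptom category stands out: its precision is high, but a recall of 61.80% pulls the F value well below the other four categories.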
Results of Question and Answering
Question category | Test questions | Correct answers | Accuracy (%)
Department | 30 | 29 | 96.7
Symptom | 30 | 28 | 93.3
Cause | 30 | 26 | 86.7
Infectivity | 30 | 22 | 73.3
Checkup | 30 | 24 | 80.0
Evaluation Results of QA in COVID-19 Dataset
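
The 86.0% average accuracy quoted in the abstract can be reproduced from the per-category counts in the table above. The sketch below pools the counts over all five categories; the dictionary keys and the function name `overall_accuracy` are assumptions made for illustration.

```python
# (test questions, correct answers) per category, from the COVID-19 table.
COVID_RESULTS = {
    "department": (30, 29),
    "symptom": (30, 28),
    "cause": (30, 26),
    "infection": (30, 22),
    "checkup": (30, 24),
}


def overall_accuracy(results):
    """Pooled accuracy in percent, rounded to one decimal place."""
    total_questions = sum(n for n, _ in results.values())
    total_correct = sum(c for _, c in results.values())
    return round(100 * total_correct / total_questions, 1)


print(overall_accuracy(COVID_RESULTS))
```

Because every category has the same number of test questions, the pooled figure equals the plain average of the five per-category accuracies.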
Question category | Test questions | Correct answers | Accuracy (%)
Checkup | 30 | 27 | 90.0
Symptom | 30 | 25 | 83.3
Evaluation Results of QA in Diabetes Dataset