New Technology of Library and Information Service  2013, Vol. Issue (6): 30-35    DOI: 10.11925/infotech.1003-3513.2013.06.05
Knowledge Organization and Knowledge Management
Comparative Analysis of Centrality Indices in Extracting Concepts from Semantic Predication Network: Based on Disease Treatment Research
Zhang Han, Liu Shuangmei
Department of Medical Informatics, China Medical University, Shenyang 110001, China
Abstract: This study compares the validity of four node centrality indices for extracting key nodes from a semantic predication network. Using the Unified Medical Language System (UMLS) and SemRep, the paper first constructs a semantic predication network for biomedical literature, in which nodes represent UMLS concepts and edges represent the semantic relations between them. Based on the semantic types of the concepts and the semantic relations between them, schemas related to disease treatment are defined and used to extract treatment-related predications. Four centrality indices (degree, betweenness, closeness, and eigenvector centrality) are then used to extract key concepts for four aspects of disease treatment: therapeutic drugs, therapeutic procedures, body location of the disease, and comorbidities. The extracted concepts are compared with a reference standard produced by domain experts. The results show that combining node centrality with semantic schemas can effectively extract the key nodes that users are interested in. Among the four indices, degree centrality performs best (F-score 0.72), followed by eigenvector centrality (F-score 0.66).
Key words: Information extraction; Semantic predication network; Semantic schema; Node centrality
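The pipeline summarized in the abstract (schema filtering, centrality ranking, comparison against a reference standard) can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: the sample predications, the schema entries, the choice of the top 2 nodes, and the reference set are all hypothetical. Only degree centrality is shown, using Freeman's normalization (degree divided by n - 1); the other three indices would be swapped in at the same step.

```python
from collections import defaultdict

# Hypothetical predications: (subject, subject_type, predicate, object, object_type).
# Types use UMLS semantic type abbreviations; the data itself is made up.
PREDICATIONS = [
    ("Metformin", "phsu", "TREATS", "Diabetes Mellitus", "dsyn"),
    ("Insulin", "phsu", "TREATS", "Diabetes Mellitus", "dsyn"),
    ("Insulin", "phsu", "TREATS", "Hyperglycemia", "dsyn"),
    ("Diet Therapy", "topp", "TREATS", "Diabetes Mellitus", "dsyn"),
    ("Obesity", "dsyn", "COEXISTS_WITH", "Diabetes Mellitus", "dsyn"),
    ("Pancreas", "bpoc", "LOCATION_OF", "Diabetes Mellitus", "dsyn"),
    ("Aspirin", "phsu", "INTERACTS_WITH", "Insulin", "phsu"),  # not in the schema
]

# Hypothetical treatment schema: allowed (subject_type, predicate, object_type) patterns.
SCHEMA = {
    ("phsu", "TREATS", "dsyn"),
    ("topp", "TREATS", "dsyn"),
    ("bpoc", "LOCATION_OF", "dsyn"),
    ("dsyn", "COEXISTS_WITH", "dsyn"),
}


def filter_by_schema(predications, schema):
    """Keep only predications whose type/predicate pattern is in the schema."""
    return [p for p in predications if (p[1], p[2], p[4]) in schema]


def degree_centrality(predications):
    """Normalized degree centrality on the undirected concept graph."""
    neighbors = defaultdict(set)
    for subj, _, _, obj, _ in predications:
        if subj != obj:  # ignore self-loops
            neighbors[subj].add(obj)
            neighbors[obj].add(subj)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def top_k(scores, k):
    """Return the k highest-scoring nodes, breaking ties alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [node for node, _ in ranked[:k]]


def f_score(extracted, reference):
    """Harmonic mean of precision and recall against a reference set."""
    extracted, reference = set(extracted), set(reference)
    true_positives = len(extracted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(extracted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


kept = filter_by_schema(PREDICATIONS, SCHEMA)
scores = degree_centrality(kept)
key_nodes = top_k(scores, 2)

# Hypothetical expert reference standard.
REFERENCE = {"Diabetes Mellitus", "Insulin", "Metformin"}
print(key_nodes, round(f_score(key_nodes, REFERENCE), 2))
```

In this toy setup the schema filter removes the drug-drug interaction before ranking, so a node's centrality reflects only treatment-related relations; that is the combination of schema and centrality the abstract credits for the results, though the paper's actual schemas and cutoff are not given on this page.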
Received: 2013-04-28
CLC number: TP391.1
Corresponding author: Zhang Han     E-mail: zhanghan@mail.cmu.edu.cn
Cite this article:
Zhang Han, Liu Shuangmei. Comparative Analysis of Centrality Indices in Extracting Concepts from Semantic Predication Network: Based on Disease Treatment Research[J]. New Technology of Library and Information Service, 2013, (6): 30-35. DOI:10.11925/infotech.1003-3513.2013.06.05.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2013.06.05