New Technology of Library and Information Service, 2015, Vol. 31, Issue 12: 57-64     https://doi.org/10.11925/infotech.1003-3513.2015.12.09
Research Article
Research on Semantic Similarity Estimation Algorithm of Medical Terminology Based on Medical Ontology
Fan Xuexue1, Wang Zhirong1, Xu Wu1, Liang Yin2, Ma Xiaohu3
1 Clinical Medical School, Xuzhou Medical College, Xuzhou 221004, China;
2 School of Computer Science and Technology, Jiangsu Normal University, Xuzhou 221116, China;
3 School of Computer Science and Technology, Soochow University, Suzhou 215006, China
Abstract

[Objective] This paper proposes a new algorithm that uses large medical ontologies to improve the precision of semantic similarity estimation for medical terms. [Methods] Based on the hierarchical structure and semantic relationships of two medical ontologies, SNOMED CT and MeSH, semantic parameters such as concept depth and distance are extracted. These parameters are weighted by concept density to obtain a depth factor and a distance factor, from which a similarity function is constructed. [Results] The algorithm can compute term similarity in both ontologies, and the experimental results show a higher correlation coefficient with manual scoring than conventional algorithms. [Limitations] The method depends heavily on the structure of the ontology. [Conclusions] The method improves the precision of ontology-based similarity estimation for medical terms.
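
As a rough illustration of the structural parameters the abstract mentions (depth and distance in an is-a hierarchy), the sketch below computes them on a small toy tree and plugs them into two classical measures: path length and Wu-Palmer similarity. It does not reproduce the paper's density-weighted depth and distance factors, whose formulas are not given on this page. The hierarchy, the term names, and all function names are assumptions made for illustration, not real SNOMED CT or MeSH content.

```python
# Toy is-a hierarchy (child -> parent). The terms are illustrative
# placeholders, NOT actual SNOMED CT or MeSH entries.
PARENT = {
    "disease": None,
    "heart disease": "disease",
    "lung disease": "disease",
    "myocardial infarction": "heart disease",
    "angina": "heart disease",
    "pneumonia": "lung disease",
}


def path_to_root(term):
    """Return the list [term, parent, ..., root]."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path


def depth(term):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(term))


def lowest_common_subsumer(a, b):
    """Deepest concept that is an ancestor of (or equal to) both terms."""
    ancestors_of_a = set(path_to_root(a))
    # Walking upward from b means the first shared node is the deepest one.
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    raise ValueError("terms do not share a root")


def path_distance(a, b):
    """Number of is-a edges on the shortest path between a and b."""
    lcs = lowest_common_subsumer(a, b)
    return (depth(a) - depth(lcs)) + (depth(b) - depth(lcs))


def wu_palmer(a, b):
    """Classical Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))
```

On this toy tree, the two siblings under "heart disease" are 2 edges apart with a Wu-Palmer score of 2/3, while "myocardial infarction" and "pneumonia" are 4 edges apart with a score of 1/3. A density-weighted method such as the one the abstract describes would additionally adjust these values according to how densely populated the surrounding region of the ontology is.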

Received: 2015-05-28      Published: 2016-04-06
Classification: TP391; G35
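Funding:

This work is one of the outcomes of the Jiangsu Province Modern Educational Technology Research Project "Development of an Intelligent Paperless Medical Examination System" (No. 19696) and the Xuzhou Medical College research project "Research on Medical Term Similarity Computation Based on SNOMED CT" (No. 2014KJ31).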

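Corresponding author: Fan Xuexue, ORCID: 0000-0002-0450-480X, E-mail: xuexuefx@126.com
Author contributions: Fan Xuexue proposed the research idea, designed and implemented the algorithm, and wrote the paper; Wang Zhirong and Xu Wu provided the experimental data and performed data analysis; Liang Yin performed data analysis and revised the paper; Ma Xiaohu revised the paper.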
Cite this article:
Fan Xuexue, Wang Zhirong, Xu Wu, Liang Yin, Ma Xiaohu. Research on Semantic Similarity Estimation Algorithm of Medical Terminology Based on Medical Ontology. New Technology of Library and Information Service, 2015, 31(12): 57-64.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2015.12.09      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2015/V31/I12/57
