New Technology of Library and Information Service  2015, Vol. 31 Issue (12): 57-64    DOI: 10.11925/infotech.1003-3513.2015.12.09
Research on Semantic Similarity Estimation Algorithm of Medical Terminology Based on Medical Ontology
Fan Xuexue¹, Wang Zhirong¹, Xu Wu¹, Liang Yin², Ma Xiaohu³
¹ Clinical Medical School, Xuzhou Medical College, Xuzhou 221004, China;
² School of Computer Science and Technology, Jiangsu Normal University, Xuzhou 221116, China;
³ School of Computer Science and Technology, Soochow University, Suzhou 215006, China

[Objective] This paper proposes a new algorithm, based on comprehensive medical ontologies, to improve the precision of semantic similarity estimation for medical terminology. [Methods] Semantic parameters such as depth and distance are extracted from the concept hierarchies and semantic relationships of SNOMED CT and MeSH. A depth factor and a distance factor, each weighted by concept density, are then derived, and the semantic similarity function is built from them. [Results] The algorithm is applicable to both of these distinct medical ontologies, and experiments show that its scores correlate more strongly with manual scoring than those of conventional algorithms. [Limitations] The algorithm depends on the hierarchical structure of the ontologies. [Conclusions] The new algorithm improves the precision of semantic similarity estimation for medical terminology.
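
To make the terms "depth" and "distance" concrete, the following is a minimal sketch of how such parameters can be read off a concept hierarchy. It does not reproduce the algorithm proposed in this paper, whose density-weighted factors are not specified in the abstract. It uses the classic Wu-Palmer measure as a stand-in, and the toy hierarchy and all function names are hypothetical rather than taken from SNOMED CT or MeSH.

```python
# Toy is-a hierarchy (hypothetical; not real SNOMED CT or MeSH content).
# Each concept maps to its single parent; the root maps to None.
PARENT = {
    "disease": None,
    "cardiovascular disease": "disease",
    "respiratory disease": "disease",
    "myocardial infarction": "cardiovascular disease",
    "heart failure": "cardiovascular disease",
    "pneumonia": "respiratory disease",
}


def ancestors(concept):
    """Return the path from the concept up to the root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(ancestors(concept))


def lowest_common_subsumer(a, b):
    """Deepest concept that is an ancestor of (or equal to) both a and b."""
    ancestors_of_b = set(ancestors(b))
    for node in ancestors(a):  # walks upward, so the first hit is the deepest
        if node in ancestors_of_b:
            return node
    raise ValueError("concepts do not share a root")


def path_distance(a, b):
    """Number of is-a edges on the shortest path between a and b."""
    lcs = lowest_common_subsumer(a, b)
    return depth(a) + depth(b) - 2 * depth(lcs)


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


if __name__ == "__main__":
    pairs = [
        ("myocardial infarction", "heart failure"),
        ("myocardial infarction", "pneumonia"),
    ]
    for a, b in pairs:
        print(a, "|", b, "| distance:", path_distance(a, b),
              "| similarity:", round(wu_palmer(a, b), 3))
```

In this toy tree, sibling concepts under "cardiovascular disease" score higher than concepts whose only shared ancestor is the root. A density-weighted method such as the one described above would further adjust these factors according to how many concepts populate each region of the hierarchy.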

Received: 28 May 2015      Published: 06 April 2016
CLC Number: TP391

Cite this article:

Fan Xuexue, Wang Zhirong, Xu Wu, Liang Yin, Ma Xiaohu. Research on Semantic Similarity Estimation Algorithm of Medical Terminology Based on Medical Ontology. New Technology of Library and Information Service, 2015, 31(12): 57-64.


