New Technology of Library and Information Service  2014, Vol. 30 Issue (5): 26-32    DOI: 10.11925/infotech.1003-3513.2014.05.04
KNOWLEDGE ORGANIZATION AND KNOWLEDGE MANAGEMENT
Research on Automatic Algorithm of Finding English Synonymous Relations for Knowledge Organization System Integration
Li Xiaoying, Li Danya, Qian Qing, Sun Haixia, Li Junlian, Hu Tiejun
Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
Abstract  

[Objective] To find synonymous relations for knowledge organization system integration. [Methods] This paper presents an automatic algorithm that consists of lemmatization and semantic merging, together with several methods for controlling the effects of vocabulary granularity. [Results] Large-scale tests on many source vocabularies, with the results compared against a well-known integrated knowledge organization system, demonstrate the algorithm's efficiency and effectiveness. [Conclusions] The proposed algorithm can be used for large-scale knowledge organization system integration and is also helpful for integrating Chinese knowledge organization systems.
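
The two stages named in the abstract, lemmatization followed by semantic merging, can be illustrated with a minimal sketch. It is not the authors' implementation: the `normalize` function is a toy stand-in for a real lemmatizer, the `(source, concept_id, term)` record layout and the sample terms are hypothetical, and the granularity controls described in the paper are not modeled.

```python
import re
from collections import defaultdict


def normalize(term: str) -> str:
    """Toy stand-in for lemmatization (an assumption, not the paper's method):
    lowercase, drop punctuation, strip simple plural endings, sort the words."""
    words = re.findall(r"[a-z0-9]+", term.lower())
    lemmas = []
    for w in words:
        if len(w) > 3 and w.endswith("ies"):
            w = w[:-3] + "y"
        elif len(w) > 3 and w.endswith("s") and not w.endswith("ss"):
            w = w[:-1]
        lemmas.append(w)
    return " ".join(sorted(lemmas))


def find_synonym_groups(records):
    """Merge terms into synonym groups with a union-find structure.

    Each record is (source, concept_id, term). A term is linked both to the
    concept its source vocabulary assigns it to and to its normalized form,
    so terms from different sources that share either one end up together.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for source, concept_id, term in records:
        union(("concept", source, concept_id), ("norm", normalize(term)))

    groups = defaultdict(set)
    for _source, _concept_id, term in records:
        groups[find(("norm", normalize(term)))].add(term)
    # Keep only groups that actually express a synonymous relation.
    return [g for g in groups.values() if len(g) > 1]


# Hypothetical sample data from two made-up source vocabularies "A" and "B".
records = [
    ("A", "a1", "Heart Attacks"),
    ("A", "a1", "Myocardial Infarction"),
    ("B", "b7", "attack, heart"),
    ("B", "b9", "Kidney"),
]
print(find_synonym_groups(records))
```

In this sketch "Heart Attacks" and "attack, heart" share a normalized form, and source A already lists "Myocardial Infarction" under the same concept as "Heart Attacks", so all three terms fall into one group while "Kidney" stays unmerged.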

Key words: Knowledge organization system integration; Finding synonymous relations; Lemmatization; Semantic merging; Granularity
Received: 02 January 2014      Published: 06 June 2014
CLC number: G250

Cite this article:

Li Xiaoying, Li Danya, Qian Qing, Sun Haixia, Li Junlian, Hu Tiejun. Research on Automatic Algorithm of Finding English Synonymous Relations for Knowledge Organization System Integration. New Technology of Library and Information Service, 2014, 30(5): 26-32.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2014.05.04     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2014/V30/I5/26
