New Technology of Library and Information Service  2010, Vol. 26 Issue (7/8): 51-57    DOI: 10.11925/infotech.1003-3513.2010.07-08.10
Review on Scientific and Technical Term Semantic Similarity Measure Methods
Xu Jian, Zhang Zhixiong, Xiao Zhuo, Deng Zhaojun
1(School of Information Management, Sun Yat-Sen University, Guangzhou 510275, China)
2(National Science Library, Chinese Academy of Sciences, Beijing 100190, China)
3(Sun Yat-Sen University Libraries, Guangzhou 510275, China)

Based on an analysis of recent literature and projects, this paper groups term semantic similarity measures into two categories: methods based on corpus characteristics and methods based on open knowledge resources. It then reviews approaches for integrating multiple measures, and summarizes applications of term semantic similarity measures in Natural Language Processing (NLP) and Knowledge Mining (KM). Finally, it discusses future directions for research on term similarity measurement, with the aim of helping to build more efficient term similarity calculation systems.

Key words: Term semantic similarity; Similarity measure; Phrase similarity
Received: 09 June 2010      Published: 19 September 2010


Corresponding Author: Xu Jian

Cite this article:

Xu Jian, Zhang Zhixiong, Xiao Zhuo, Deng Zhaojun. Review on Scientific and Technical Term Semantic Similarity Measure Methods. New Technology of Library and Information Service, 2010, 26(7/8): 51-57.


