New Technology of Library and Information Service  2010, Vol. 26 Issue (7/8): 51-57    DOI: 10.11925/infotech.1003-3513.2010.07-08.10
Knowledge Organization and Knowledge Management
Review on Scientific and Technical Term Semantic Similarity Measure Methods
Xu Jian, Zhang Zhixiong, Xiao Zhuo, Deng Zhaojun
1(School of Information Management, Sun Yat-Sen University, Guangzhou 510275, China)
2(National Science Library, Chinese Academy of Sciences, Beijing 100190, China)
3(Sun Yat-Sen University Libraries, Guangzhou 510275, China)
Abstract

Based on an analysis of recent literature and projects on term semantic similarity, this paper groups similarity measures for scientific and technical terms into two categories: measures based on corpus characteristics and measures based on open knowledge resources. It then reviews methods for integrating multiple similarity measures. It also summarizes applications of term semantic similarity measures in natural language processing (NLP) and knowledge mining (KM). Finally, it discusses future directions for research on term similarity, as a reference for building more efficient term similarity calculation systems.

Key words: Term semantic similarity; Similarity measure; Phrase similarity
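To make the two categories and the integration step more concrete, the following is a minimal Python sketch, not a method taken from the paper. It assumes a tiny hypothetical corpus, a tiny hand-made taxonomy standing in for an open knowledge resource, and a simple weighted average as the integration rule. All data, function names, and the `weight` parameter are illustrative assumptions.

```python
import math
from collections import Counter

# Hypothetical toy corpus: each sentence is a list of tokens.
CORPUS = [
    "gene expression regulates protein synthesis".split(),
    "protein synthesis depends on gene transcription".split(),
    "the library catalog indexes journal articles".split(),
]

# Hypothetical toy taxonomy (child -> parent), standing in for a knowledge resource.
PARENT = {
    "gene": "biomolecule",
    "protein": "biomolecule",
    "biomolecule": "entity",
    "catalog": "document",
    "document": "entity",
}


def context_vector(term, corpus):
    """Count the words that co-occur with `term` in the same sentence."""
    counts = Counter()
    for sentence in corpus:
        if term in sentence:
            counts.update(w for w in sentence if w != term)
    return counts


def corpus_similarity(a, b, corpus=CORPUS):
    """Corpus-based measure: cosine similarity of co-occurrence vectors."""
    va, vb = context_vector(a, corpus), context_vector(b, corpus)
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def ancestors(term):
    """Return the path from `term` up to the taxonomy root, starting with `term`."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def taxonomy_similarity(a, b):
    """Knowledge-based measure: 1 / (1 + path length through the nearest shared ancestor)."""
    pa, pb = ancestors(a), ancestors(b)
    common = [node for node in pa if node in pb]
    if not common:
        return 0.0
    nearest = common[0]
    distance = pa.index(nearest) + pb.index(nearest)
    return 1.0 / (1.0 + distance)


def combined_similarity(a, b, weight=0.5):
    """Integration step (assumed here to be a plain weighted average)."""
    return weight * corpus_similarity(a, b) + (1 - weight) * taxonomy_similarity(a, b)


if __name__ == "__main__":
    print(combined_similarity("gene", "protein"))
    print(combined_similarity("gene", "catalog"))
```

On this toy data, "gene" and "protein" share both contexts and a close taxonomy ancestor, so their combined score is higher than that of "gene" and "catalog". A weighted average is only one possible integration rule, and in practice the weight would need to be tuned or learned.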
Received: 2010-06-09
Classification number: G250.73

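Funding:

This work is one of the research outcomes of the project "Mining Term Similarity from Scientific and Technical Literature and Its Application in Knowledge Discovery" (Project No. 09YJC870031), funded by the Humanities and Social Sciences Research Program of the Ministry of Education.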

Corresponding author: Xu Jian     E-mail: issxj@mail.sysu.edu.cn
Cite this article:
Xu Jian, Zhang Zhixiong, Xiao Zhuo, Deng Zhaojun. Review on Scientific and Technical Term Semantic Similarity Measure Methods[J]. New Technology of Library and Information Service, 2010, 26(7/8): 51-57. DOI: 10.11925/infotech.1003-3513.2010.07-08.10.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2010.07-08.10

