New Technology of Library and Information Service  2011, Vol. 27 Issue (9): 28-33    DOI: 10.11925/infotech.1003-3513.2011.09.05
Knowledge Organization and Knowledge Management
A Term Similarity Algorithm Based on Context Dependency Relation Pattern
Xu Jian
School of Information Management, Sun Yat-sen University, Guangzhou 510006, China
Abstract: Existing context-based term similarity algorithms have shortcomings in how they generate and match context patterns. To address them, this paper proposes a term similarity algorithm that automatically constructs term context patterns from the syntactic dependency relations of terms and then computes term similarity by matching those patterns. The approach makes context patterns easier to generate and match while preserving the contextual features of terms in the patterns. The paper presents the concrete implementation steps and compares the new algorithm with typical existing methods on experimental data from the genetic engineering domain. The experimental results show a clear improvement in the quality of the computed similarities.
Keywords: Term similarity; Context similarity; Similarity computation
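The idea summarized in the abstract (build context patterns from dependency relations, then compare terms by matching their patterns) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the triple format `(relation, governor, dependent)`, the `KEPT_RELATIONS` filter list, the `TERM` placeholder, the sample triples, and the use of Jaccard overlap as the matching score. The paper's own pattern format and similarity formula are not given on this page.

```python
from collections import defaultdict

# Hypothetical set of dependency relation types kept after filtering;
# the paper's actual list is not reproduced here.
KEPT_RELATIONS = {"nsubj", "dobj", "amod", "prep_of"}


def build_patterns(triples, terms):
    """Map each term to the set of context patterns it occurs in.

    A triple is (relation, governor, dependent). The term's slot is
    replaced by the placeholder "TERM" so that patterns collected for
    different terms can be compared directly.
    """
    patterns = defaultdict(set)
    for rel, gov, dep in triples:
        if rel not in KEPT_RELATIONS:
            continue  # drop relation types that are filtered out
        if gov in terms:
            patterns[gov].add((rel, "TERM", dep))
        if dep in terms:
            patterns[dep].add((rel, gov, "TERM"))
    return patterns


def pattern_similarity(patterns, term_a, term_b):
    """Jaccard overlap of two terms' pattern sets (an assumed measure)."""
    a = patterns.get(term_a, set())
    b = patterns.get(term_b, set())
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hand-written dependency triples standing in for parser output.
triples = [
    ("nsubj", "encodes", "gene"),
    ("dobj", "encodes", "protein"),
    ("amod", "gene", "mutant"),
    ("nsubj", "encodes", "plasmid"),
    ("dobj", "express", "plasmid"),
    ("det", "gene", "the"),
]
terms = {"gene", "plasmid", "protein"}

patterns = build_patterns(triples, terms)
sim = pattern_similarity(patterns, "gene", "plasmid")
print(f"gene vs plasmid: {sim:.3f}")
```

In this toy input, "gene" and "plasmid" share one pattern (both appear as the subject of "encodes") out of three distinct patterns between them, so the score is one third. In practice the triples would come from a dependency parser, and the score would be whatever matching formula the method defines.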
Received: 2011-07-22
CLC number: G250.73
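Funding: This work is one of the research outcomes of the Ministry of Education Humanities and Social Sciences Research Project "Mining Term Similarity from Scientific and Technical Literature and Its Application in Knowledge Discovery" (Project No. 09YJC870031).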

Cite this article:
Xu Jian. A Term Similarity Algorithm Based on Context Dependency Relation Pattern[J]. New Technology of Library and Information Service, 2011, 27(9): 28-33. DOI: 10.11925/infotech.1003-3513.2011.09.05.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2011.09.05