New Technology of Library and Information Service  2011, Vol. 27 Issue (9): 28-33    DOI: 10.11925/infotech.1003-3513.2011.09.05
A Term Similarity Algorithm Based on Context Dependency Relation Pattern
Xu Jian
School of Information Management, Sun Yat-Sen University, Guangzhou 510006, China
Abstract  To address the problems of typical term context similarity algorithms, this paper proposes a new term similarity algorithm that constructs context patterns automatically through sentence dependency analysis and then computes term similarity by mapping those context patterns. The algorithm offers a better way to construct term context patterns, and the patterns preserve the contextual characteristics of the terms well. The paper also presents the specific implementation steps of the new algorithm and evaluates it on an experimental data set from the genetic engineering field. The experimental results demonstrate that the algorithm achieves a clear improvement in computing performance.
Key words: Term similarity; Context similarity; Similarity computation
Received: 22 July 2011      Published: 02 December 2011
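The approach summarized in the abstract (build context patterns from sentence dependencies, then compare terms through those patterns) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the input is a hand-written list of dependency triples of the form (relation, governor, dependent), the function names `context_patterns` and `pattern_similarity` are hypothetical, and Jaccard overlap stands in for the paper's own pattern-mapping measure, which the abstract does not specify.

```python
from collections import defaultdict


def context_patterns(triples, terms):
    """Map each term to the set of dependency contexts it occurs in.

    A pattern is (relation, slot, other_word); slot records whether the
    term was the governor ("gov") or the dependent ("dep") of the relation.
    This representation is an illustrative assumption.
    """
    patterns = defaultdict(set)
    for rel, gov, dep in triples:
        if gov in terms:
            patterns[gov].add((rel, "gov", dep))
        if dep in terms:
            patterns[dep].add((rel, "dep", gov))
    return patterns


def pattern_similarity(p1, p2):
    """Jaccard overlap of two pattern sets (an assumed stand-in measure)."""
    if not p1 and not p2:
        return 0.0
    return len(p1 & p2) / len(p1 | p2)


# Hand-written triples such as a dependency parser might emit (made up).
triples = [
    ("nsubj", "encodes", "gene"),
    ("dobj", "encodes", "protein"),
    ("nsubj", "encodes", "plasmid"),
    ("amod", "gene", "recombinant"),
    ("amod", "plasmid", "recombinant"),
    ("amod", "protein", "soluble"),
    ("prep_of", "expression", "gene"),
]
terms = {"gene", "plasmid", "protein"}

patterns = context_patterns(triples, terms)
sim_gene_plasmid = pattern_similarity(patterns["gene"], patterns["plasmid"])
sim_gene_protein = pattern_similarity(patterns["gene"], patterns["protein"])
print(sim_gene_plasmid, sim_gene_protein)
```

In this toy data, "gene" and "plasmid" share two of three distinct contexts, while "gene" and "protein" share none, so the first pair scores higher. A real pipeline would obtain the triples from a dependency parser and filter them to the relation types worth keeping.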
CLC Number: G250.73

Cite this article:

Xu Jian. A Term Similarity Algorithm Based on Context Dependency Relation Pattern. New Technology of Library and Information Service, 2011, 27(9): 28-33.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2011.09.05     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2011/V27/I9/28
