[Objective] This paper adds a topic layer between the document and word layers in order to calculate word similarities effectively. [Methods] First, we proposed a topic definition and representation model based on the theory of formal concept analysis. Then, we mapped words to the topic layer. Finally, we developed an algorithm that calculates word similarities using topic-to-topic relationships. [Results] We applied the proposed method to SIGIR conference papers from 2006 to 2016 to calculate word similarities in the field of information retrieval. The precision and recall of the proposed method were up to 30% and 21% higher, respectively, than those of the FastText method. [Limitations] The proposed method relies on the quality of the feature words extracted from documents. [Conclusions] The proposed method utilizes the semantic relations among associated topics and effectively calculates word similarities.
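The abstract does not give the topic model or the similarity formula, so the following is only a minimal sketch of the general idea, under stated assumptions. It treats each formal concept of a document-by-word context as a "topic", maps a word to every topic whose intent contains it, and scores two words by the Jaccard overlap of their topic sets. The toy data, the function names, and the Jaccard score are illustrative assumptions, not the authors' method; in particular, the sketch does not model the topic-to-topic relationships that the paper relies on.

```python
from itertools import combinations

# Hypothetical toy context: each document maps to its extracted feature words.
CONTEXT = {
    "d1": {"query", "ranking", "retrieval"},
    "d2": {"query", "ranking"},
    "d3": {"ranking", "learning"},
    "d4": {"learning", "embedding"},
}
ALL_WORDS = frozenset().union(*CONTEXT.values())


def shared_words(docs):
    """Words common to every document in `docs` (all words if `docs` is empty)."""
    result = set(ALL_WORDS)
    for d in docs:
        result &= CONTEXT[d]
    return frozenset(result)


def docs_containing(words):
    """Documents whose feature words include every word in `words`."""
    return frozenset(d for d, ws in CONTEXT.items() if words <= ws)


def formal_concepts():
    """Enumerate (extent, intent) pairs by closing every subset of documents.

    Brute force and exponential in the number of documents; toy use only.
    """
    concepts = set()
    docs = sorted(CONTEXT)
    for size in range(len(docs) + 1):
        for subset in combinations(docs, size):
            intent = shared_words(subset)
            extent = docs_containing(intent)
            concepts.add((extent, intent))
    return concepts


def topics():
    """Assumption: a topic is a concept with non-empty extent and intent."""
    return [c for c in formal_concepts() if c[0] and c[1]]


def word_similarity(w1, w2):
    """Assumed score: Jaccard overlap of the topic sets containing each word."""
    t1 = {c for c in topics() if w1 in c[1]}
    t2 = {c for c in topics() if w2 in c[1]}
    union = t1 | t2
    return len(t1 & t2) / len(union) if union else 0.0


print(word_similarity("query", "ranking"))
print(word_similarity("query", "embedding"))
```

In this toy context, "query" and "ranking" share topics because they co-occur in d1 and d2, whereas "query" and "embedding" share none. A fuller treatment would also exploit the ordering between concepts in the lattice, which is closer to what the abstract calls topic-to-topic relationships.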
Liu Ping, Peng Xiaofang. Calculating Word Similarities Based on Formal Concept Analysis[J]. Data Analysis and Knowledge Discovery, 2020, 4(5): 66-74.