Chinese Synonyms Discovery Based on the Term Definition
Yin Xihong, Qiao Xiaodong, Zhang Yunliang
Institute of Scientific & Technical Information of China, Beijing 100038, China
Abstract [Objective] Inspired by Lesk's research on word sense disambiguation, this paper proposes an approach that discovers synonyms from term definitions. [Methods] The test set is built on the Chinese Scientific and Technical Vocabulary System (New Energy Vehicles). First, the term definitions are segmented into words, tagged with parts of speech, and manually corrected. Then the content words (verbs and nouns) are extracted, and the similarity of two terms is calculated from the number of content words their definitions share and the positions of those shared words. Finally, synonym relations are recommended according to the similarity and a given threshold. [Results] Precision, recall, and F value are used to evaluate the discovered synonyms and to demonstrate the effectiveness of the method. The results show that the method achieves high precision but low recall. [Limitations] The method cannot exclude terms with antonymous or related relationships, resulting in a lower recall rate. [Conclusions] The method is simple and effective and achieves high precision, although a higher recall rate is still to be achieved.
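The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the similarity formula, so the position-weighted Dice overlap below is an assumption made only for illustration, as are the function names, the tag set `{"n", "v"}`, and the toy definitions. The input is assumed to be already segmented, POS-tagged, and manually corrected.

```python
from itertools import combinations

CONTENT_POS = {"n", "v"}  # assumed tag set: nouns and verbs


def content_words(tagged_definition):
    """Keep nouns and verbs, in order, from a list of (word, pos) pairs."""
    return [word for word, pos in tagged_definition if pos in CONTENT_POS]


def first_positions(words):
    """Map each distinct word to the index of its first occurrence."""
    positions = {}
    for index, word in enumerate(words):
        positions.setdefault(word, index)
    return positions


def similarity(words_a, words_b):
    """Hypothetical score in [0, 1]: Dice overlap of content words, where each
    shared word counts less the further apart its relative positions are."""
    if not words_a or not words_b:
        return 0.0
    pos_a, pos_b = first_positions(words_a), first_positions(words_b)
    score = 0.0
    for word in pos_a.keys() & pos_b.keys():
        rel_a = pos_a[word] / len(words_a)
        rel_b = pos_b[word] / len(words_b)
        score += 1.0 - abs(rel_a - rel_b)
    return 2.0 * score / (len(pos_a) + len(pos_b))


def recommend(definitions, threshold):
    """Return (term_a, term_b, score) for every pair scoring >= threshold."""
    words = {term: content_words(d) for term, d in definitions.items()}
    pairs = []
    for a, b in combinations(sorted(words), 2):
        score = similarity(words[a], words[b])
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs


def evaluate(predicted, gold):
    """Precision, recall, and F value over sets of frozenset term pairs."""
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    total = precision + recall
    f_value = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_value


# Toy, made-up definitions (not taken from the vocabulary system).
toy_definitions = {
    "EV": [("vehicle", "n"), ("driven", "v"), ("by", "p"), ("motor", "n")],
    "electric car": [("vehicle", "n"), ("driven", "v"), ("by", "p"), ("motor", "n")],
    "battery": [("device", "n"), ("stores", "v"), ("energy", "n")],
}
recommended = recommend(toy_definitions, threshold=0.5)
predicted_pairs = {frozenset((a, b)) for a, b, _ in recommended}
gold_pairs = {frozenset(("EV", "electric car"))}
metrics = evaluate(predicted_pairs, gold_pairs)
```

A score built only on shared content words gives antonyms and closely related terms high similarity whenever their definitions overlap, which is consistent with the limitation stated in the abstract.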
Received: 06 January 2014
Published: 19 May 2014
[1] Wei X,Peng F C,Tseng H.Search with Synonyms:Problems and Solutions[C].In:Proceedings of the International Conference on Computational Linguistics (COLING).2010:1318-1326.
[2] Gnoli C.Ten Long-term Research Questions in Knowledge Organization[J].Knowledge Organization,2008,35(2-3):137-149.
[3] 陆勇,侯汉清.用于信息检索的同义词自动识别及其进展[J].南京农业大学学报:社会科学版,2004,4(3):87-93.(Lu Yong,Hou Hanqing.Synonyms Automatic Identification and Progress for Information Retrieval[J].Journal of Nanjing Agricultural University:Social Science Edition,2004,4(3):87-93.)
[4] 钟伟金.基于共现"互斥互信"原理的同义词识别[J].中华医学图书情报杂志,2012,21(5):1-4.(Zhong Weijin."Mutual Exclusion and Mutual Trust" Co-Occurrence Principle-Based Identification of Synonyms[J].Chinese Journal of Medical Library and Information Science,2012,21(5):1-4.)
[5] Grushetsky O,Baker S D.Document-based Synonym Generation:United States,US7890521 B1[P].(2011-02-15).[2013-11-15].http://www.google.com.tw/patents/US7890521.
[6] 宋丹,师庆辉,薛德军,等.术语同义词的自动抽取[C].见:第三届全国信息检索与内容安全学术会议,苏州,江苏.2007.(Song Dan,Shi Qinghui,Xue Dejun,et al.Automation Extraction of Similar Term[C].In:Proceedings of the 3rd National Information Retrieval and Content Security Conference,Suzhou,Jiangsu.2007.)
[7] 孙霞,董乐红.基于监督学习的同义关系自动抽取方法[J].西北大学学报:自然科学版,2008,38(1):35-39.(Sun Xia,Dong Lehong.Automatic Extraction Method of Synonym Relationship Based on Supervised Learning[J].Journal of Northwest University:Natural Science Edition,2008,38(1):35-39.)
[8] Muller P,Langlais P.Comparing Distributional and Mirror Translation Similarities for Extracting Synonyms[C].In:Proceedings of the 24th Canadian Conference on Advances in Artificial Intelligence.Berlin,Heidelberg:Springer-Verlag,2011:323-334.
[9] Van der Plas L,Tiedemann J,Manguin J L.Automatic Acquisition of Synonyms for French Using Parallel Corpora[C].In:Proceedings of the 4th International Workshop on Distributed Agent-based Retrieval Tools.2010.
[10] Meusel R,Niepert M,Eckert K,et al.Thesaurus Extension Using Web Search Engines[C].In:Proceedings of the Role of Digital Libraries in a Time of Global Change,and 12th International Conference on Asia-Pacific Digital Libraries.2010:198-207.
[11] 张运良,乔晓东,朱礼军,等.基于术语翻译信息的同义关系快速构建方法研究[J].图书情报工作,2013,57(8):109-113.(Zhang Yunliang,Qiao Xiaodong,Zhu Lijun,et al.Rapid Construction Method of Synonym Relationship Based on Terms Translation Information[J].Library and Information Service,2013,57(8):109-113.)
[12] Muller P,Hathout N,Gaume B.Synonym Extraction Using a Semantic Distance on a Dictionary[C].In:Proceedings of the 1st Workshop on Graph Based Methods for Natural Language Processing.Stroudsburg:Association for Computational Linguistics,2006:65-72.
[13] 陆勇,侯汉清.基于PageRank值的汉语同义词自动识别[J].西华大学学报:自然科学版,2008,27(2):13-15,94.(Lu Yong,Hou Hanqing.Automatic Recognition of Chinese Synonyms Based on PageRank Algorithm[J].Journal of Xihua University:Natural Science Edition,2008,27(2):13-15,94.)
[14] Wu H,Zhou M.Optimizing Synonym Extraction Using Monolingual and Bilingual Resource[C].In:Proceedings of the 2nd International Workshop on Paraphrasing.Stroudsburg:Association for Computational Linguistics,2003:72-79.
[15] Lesk M.Information in Data:Using the Oxford English Dictionary on a Computer[J].SIGIR Forum,1986,20(1-4):18-21.
[16] Banerjee S,Pedersen T.Extended Gloss Overlaps as a Measure of Semantic Relatedness[C].In:Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI'03).2003:805-810.
[17] 朱毅华.智能搜索引擎中的同义词识别算法研究[D].南京:南京农业大学,2001.(Zhu Yihua.Automatic Recognition of Synonym in Construction of Intelligent Search Engine[D].Nanjing:Nanjing Agriculture University,2001.)
[18] 贺德方,乔晓东,朱礼军,等.汉语科技词系统(新能源汽车卷)[M].北京:科学技术文献出版社,2012.(He Defang,Qiao Xiaodong,Zhu Lijun,et al.Chinese Scientific and Technical Vocabulary System (New Energy Vehicles)[M].Beijing:Scientific and Technical Documentation Press,2012.)