New Technology of Library and Information Service  2013, Vol. 29 Issue (9): 35-40    DOI: 10.11925/infotech.1003-3513.2013.09.06
Identifying Synonyms Based on Sentence Structure Analysis
Yu Juan¹, Yin Jidong², Fei Shu³
1. School of Public Administration and Policy, Fuzhou University, Fuzhou 350108, China;
2. Jiangxi Institute of Standardization, Nanchang 330029, China;
3. Library of Dalian Vocational & Technical College, Dalian 116035, China
Abstract  This paper proposes a new method for identifying synonyms that aims to reduce the deviation in calculating the semantic similarity between two terms or phrases. The method first analyzes the sentence structure of the terms (or phrases) concerned, and then calculates their semantic similarity based on Tongyici Cilin (a Chinese thesaurus). It weights each word in a term (or phrase) equally, which reduces the identification errors made by gravity-centre-backward methods. Experiments show that the proposed method is accurate and has good potential for text mining and semantic retrieval applications.
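The equal-weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sentence structure analysis step is skipped (terms are assumed to be already segmented into words), the thesaurus entries and codes in `TOY_CILIN` are invented for illustration and only mimic the layout of Tongyici Cilin codes, and both the word-level score (the fraction of shared hierarchy levels) and the phrase-level aggregation (a symmetric average of best matches) are simplifying assumptions.

```python
# Toy thesaurus: word -> Cilin-style 8-character code.
# ASSUMPTION: these words and codes are invented for illustration only.
TOY_CILIN = {
    "computer": "Bo01A01=",
    "calculator": "Bo01A02=",
    "network": "Dn02B01=",
    "internet": "Dn02B01=",
    "security": "Ed03C01=",
    "safety": "Ed03C02=",
}


def split_levels(code):
    """Split a code into five hierarchy levels (the trailing flag is ignored)."""
    return [code[0], code[1], code[2:4], code[4], code[5:7]]


def word_similarity(w1, w2):
    """Fraction of hierarchy levels shared from the top (a simplified measure)."""
    if w1 == w2:
        return 1.0
    c1, c2 = TOY_CILIN.get(w1), TOY_CILIN.get(w2)
    if c1 is None or c2 is None:
        return 0.0  # unknown words are treated as unrelated
    shared = 0
    for a, b in zip(split_levels(c1), split_levels(c2)):
        if a != b:
            break
        shared += 1
    return shared / 5


def phrase_similarity(words_a, words_b):
    """Average best-match similarity, giving every word the same weight."""
    if not words_a or not words_b:
        return 0.0

    def directed(src, dst):
        # Each source word contributes equally, whatever its position.
        return sum(max(word_similarity(s, d) for d in dst) for s in src) / len(src)

    return (directed(words_a, words_b) + directed(words_b, words_a)) / 2


if __name__ == "__main__":
    print(phrase_similarity(["computer", "network"], ["calculator", "internet"]))
```

Because every word contributes equally to the average, a mismatch in the first word lowers the score as much as a mismatch in the last word, which is the contrast with gravity-centre-backward weighting that the abstract draws.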
Key words: Identifying synonyms; Sentence structure analysis; Text mining
Received: 08 May 2013      Published: 27 September 2013
CLC Number: TP182

Cite this article:

Yu Juan, Yin Jidong, Fei Shu. Identifying Synonyms Based on Sentence Structure Analysis. New Technology of Library and Information Service, 2013, 29(9): 35-40.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2013.09.06     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2013/V29/I9/35
