New Technology of Library and Information Service  2016, Vol. 32 Issue (9): 34-41    DOI: 10.11925/infotech.1003-3513.2016.09.04
Original Article
Using Semantic Model to Build Lexical Chains
Qu Yunpeng1,2,3, Wang Wenling3
1University of Chinese Academy of Sciences, Beijing 100049, China
2National Science Library, Chinese Academy of Sciences, Beijing 100190, China
3National Library of China, Beijing 100081, China
Abstract  

[Objective] This paper uses distributional semantics to build high-quality lexical chains. [Methods] First, we built an algorithm that uses the WordNet thesaurus to compute the semantic relations among the language units of a text. Second, we adopted the Distributional Memory model to compute the latent semantic relations among those units. Finally, we combined both kinds of relations to build the lexical chains, and evaluated them on papers from medical science. [Results] The proposed algorithm described the papers' topics better than the non-greedy methods. [Limitations] The efficiency of the algorithm needs to be improved, and the algorithm should also be evaluated on papers from other fields. [Conclusions] The proposed model can detect latent semantic relations and thereby improve the quality of lexical chains built with phrases.
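The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the toy thesaurus pairs, the toy vectors, the equal weighting, the threshold, and the single-pass greedy chaining are all assumptions made for illustration. A real system would presumably query WordNet and a Distributional Memory model instead of the hard-coded tables below.

```python
import math

# Hypothetical stand-in for thesaurus relations (e.g. pairs WordNet would link).
THESAURUS_RELATED = {
    frozenset({"tumor", "neoplasm"}),
    frozenset({"drug", "medication"}),
}

# Hypothetical toy vectors standing in for a distributional model.
VECTORS = {
    "tumor": [0.9, 0.1, 0.0],
    "neoplasm": [0.8, 0.2, 0.0],
    "chemotherapy": [0.7, 0.3, 0.1],
    "drug": [0.1, 0.9, 0.1],
    "medication": [0.2, 0.8, 0.0],
    "library": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relatedness(a, b, weight=0.5):
    """Weighted mix of a thesaurus score and a distributional (latent) score.

    The equal default weighting is an assumption, not a value from the paper.
    """
    if a == b:
        return 1.0
    thesaurus = 1.0 if frozenset({a, b}) in THESAURUS_RELATED else 0.0
    latent = cosine(VECTORS[a], VECTORS[b]) if a in VECTORS and b in VECTORS else 0.0
    return weight * thesaurus + (1.0 - weight) * latent


def build_chains(terms, threshold=0.4):
    """Single-pass greedy chaining: attach each term to its best-matching chain.

    A term starts a new chain when no existing chain contains a member whose
    combined relatedness to it reaches the (assumed) threshold.
    """
    chains = []
    for term in terms:
        best_chain, best_score = None, threshold
        for chain in chains:
            score = max(relatedness(term, member) for member in chain)
            if score >= best_score:
                best_chain, best_score = chain, score
        if best_chain is None:
            chains.append([term])
        else:
            best_chain.append(term)
    return chains


if __name__ == "__main__":
    terms = ["tumor", "drug", "neoplasm", "chemotherapy", "medication", "library"]
    for chain in build_chains(terms):
        print(chain)
```

In this toy setup "chemotherapy" has no thesaurus link to any other term, yet it joins the chain containing "tumor" and "neoplasm" through vector similarity alone, which is the kind of latent relation the abstract refers to.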

Key words: WordNet; Distributional Memory; Lexical Chain; Distributional Semantics
Received: 08 April 2016      Published: 19 October 2016

Cite this article:

Qu Yunpeng, Wang Wenling. Using Semantic Model to Build Lexical Chains. New Technology of Library and Information Service, 2016, 32(9): 34-41.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2016.09.04     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2016/V32/I9/34
