New Technology of Library and Information Service  2013, Vol. 29 Issue (1): 50-56    DOI: 10.11925/infotech.1003-3513.2013.01.08
Study on the Keyword Extraction from Roadmap Based on the Lexical Chains
Ye Chunlei (1,2), Leng Fuhai (2)
1. Information Department, Beijing City University, Beijing 100094, China;
2. National Science Library, Chinese Academy of Sciences, Beijing 100190, China
Abstract  This paper proposes a keyword extraction method based on lexical chains. The method describes the topics of a technical field in a technology roadmap by constructing lexical chains, and treats each chain as the set of semantic relations among the keywords of that field. Experiments show that the method extracts keywords that reveal the content of the technical fields in a technology roadmap more comprehensively, and that it achieves significantly higher precision and recall than TF-IDF.
Key words  Lexical chains, Keyword extraction, Technology roadmap, TF-IDF
Received: 26 December 2012      Published: 29 March 2013
CLC number: G350
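
The abstract compares the proposed method against a TF-IDF baseline using precision and recall. The following is a minimal sketch of such a baseline and its evaluation, not the authors' implementation: the function names, the unsmoothed TF-IDF weighting, the toy documents, and the gold keyword set are all assumptions made for illustration, and the lexical chain construction itself is not reproduced here.

```python
import math
from collections import Counter


def tfidf_keywords(docs, doc_index, top_k):
    """Rank the terms of one tokenized document by TF-IDF; return the top_k.

    Assumption: plain tf * log(N / df) weighting with no smoothing.
    The paper's exact weighting is not given in the abstract.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    tf = Counter(docs[doc_index])
    total = len(docs[doc_index])
    scores = {t: (c / total) * math.log(n_docs / df[t]) for t, c in tf.items()}
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_k]


def precision_recall(extracted, gold):
    """Precision and recall of extracted keywords against a gold keyword set."""
    extracted, gold = set(extracted), set(gold)
    hits = len(extracted & gold)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical tokenized roadmap sections (illustrative data only).
    docs = [
        ["nanotube", "composite", "nanotube", "sensor"],
        ["sensor", "battery"],
        ["battery", "composite", "sensor"],
    ]
    keywords = tfidf_keywords(docs, doc_index=0, top_k=2)
    # Hypothetical manually assigned keywords for the first section.
    gold = ["nanotube", "sensor", "coating"]
    print(keywords, precision_recall(keywords, gold))
```

A lexical chain method would replace `tfidf_keywords` with a ranking that groups semantically related terms before scoring, while `precision_recall` could be reused unchanged to compare the two approaches on the same gold set.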

Cite this article:

Ye Chunlei, Leng Fuhai. Study on the Keyword Extraction from Roadmap Based on the Lexical Chains. New Technology of Library and Information Service, 2013, 29(1): 50-56.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2013.01.08     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2013/V29/I1/50
