New Technology of Library and Information Service  2013, Vol. 29 Issue (1): 50-56    DOI: 10.11925/infotech.1003-3513.2013.01.08
Section: Information Analysis and Research
Study on the Keyword Extraction from Roadmap Based on the Lexical Chains
Ye Chunlei (1, 2), Leng Fuhai (2)
1. Information Department, Beijing City University, Beijing 100094, China;
2. National Science Library, Chinese Academy of Sciences, Beijing 100190, China
Abstract: This paper proposes a keyword extraction method based on lexical chains. The method constructs lexical chains to describe the technical-field topics of a technology roadmap, and treats each chain as a word sequence that represents the roadmap's domain keywords, its core technology keywords, and the semantic relations among them. Experiments show that the keywords extracted this way reveal the technical-field topics of the roadmap more comprehensively, and that both precision and recall are markedly higher than with the TF-IDF method.
Keywords: Lexical chains; Keyword extraction; Technology roadmap; TF-IDF
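The abstract compares the method against TF-IDF in terms of precision and recall, but this page does not give the formulas or the evaluation data. The following is only a minimal sketch of such a comparison baseline, assuming the standard TF-IDF weighting (term frequency times log inverse document frequency) and the standard set-based definitions of precision and recall. The function names, the toy corpus, and the gold keyword set are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tfidf_keywords(docs, doc_index, top_k):
    """Return the top_k terms of docs[doc_index] ranked by TF-IDF.

    docs is a list of tokenized documents (lists of terms).
    Assumption: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    tf = Counter(docs[doc_index])
    total = len(docs[doc_index])
    scores = {
        term: (count / total) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_k]


def precision_recall(extracted, gold):
    """Set-based precision and recall of extracted keywords against a gold set."""
    extracted, gold = set(extracted), set(gold)
    hits = len(extracted & gold)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy corpus: document 0 stands in for a roadmap section.
docs = [
    ["nanotube", "composite", "nanotube", "sensor", "material"],
    ["sensor", "battery", "material", "battery"],
    ["material", "policy", "funding", "policy"],
]
# Hypothetical manually assigned keywords for document 0.
gold = {"nanotube", "composite", "material", "coating"}

keywords = tfidf_keywords(docs, 0, top_k=3)
precision, recall = precision_recall(keywords, gold)
print(keywords, round(precision, 3), round(recall, 3))
```

In this toy data the term "material" occurs in every document, so its inverse document frequency is zero and TF-IDF ranks it last even though the hypothetical gold set lists it as a keyword. A lexical-chain method would be scored with the same `precision_recall` function by passing its extracted keywords instead.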
Received: 2012-12-26
CLC Number: G350
Corresponding author: Ye Chunlei     E-mail: yechunlei@mail.las.ac.cn
Cite this article:
Ye Chunlei, Leng Fuhai. Study on the Keyword Extraction from Roadmap Based on the Lexical Chains[J]. New Technology of Library and Information Service, 2013, 29(1): 50-56. DOI: 10.11925/infotech.1003-3513.2013.01.08.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2013.01.08