New Technology of Library and Information Service  2015, Vol. 31 Issue (3): 26-32    DOI: 10.11925/infotech.1003-3513.2015.03.04
Patent Keyword Indexing Based on Weighted Complex Graph Model
Li Junfeng, Lv Xueqiang, Zhou Shaojun
Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science & Technology University, Beijing 100101, China
Abstract  

[Objective] Patent keyword indexing plays an important role in natural language processing and is widely applied in fields such as patent retrieval, translation, and automatic summarization. [Methods] A K-proximity coupled graph is used to convert patents into a complex graph model, and an average connectivity weight is proposed that combines the average path variation, the average clustering coefficient, and the liquidity effect of the current node. Taking into account location information, word-gap information, and the inverse document frequency of keywords, a comprehensive patent correlation calculation method is proposed to quantify keyword importance. [Results] Experiments on patent literature in the sensor domain achieve a precision of 60.9% on the top 8 keywords and a recall of 73.4% on the top 10. [Limitations] Results for low-frequency keywords are not good enough, which affects the overall indexing result. [Conclusions] The experimental results show that the method is effective and has positive significance for patent indexing.
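
The abstract does not give the formulas behind the graph model, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a "K-proximity" graph links two distinct words occurring within k positions of each other, that the "average path variation" of a node can be approximated by the change in average shortest-path length when that node is removed (unreachable pairs are ignored), and that the clustering coefficient is the standard local one. All function names are hypothetical. The last helper computes precision and recall on the top-k ranked keywords, which is how figures such as "precision on top-8" are usually defined; it assumes a non-empty gold keyword list.

```python
from collections import deque
from itertools import combinations


def build_k_proximity_graph(tokens, k=2):
    """Link two distinct words when they occur within k positions of each other."""
    graph = {word: set() for word in tokens}
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + k]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def average_path_length(graph, skip=None):
    """Mean shortest-path length over reachable node pairs, ignoring node `skip`."""
    total, pairs = 0, 0
    for source in graph:
        if source == skip:
            continue
        # Breadth-first search gives shortest paths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt != skip and nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def clustering_coefficient(graph, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    neighbours = graph[node]
    if len(neighbours) < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    possible = len(neighbours) * (len(neighbours) - 1) / 2
    return linked / possible


def path_variation(graph, node):
    """Assumed proxy: change in average path length when `node` is removed."""
    return average_path_length(graph, skip=node) - average_path_length(graph)


def precision_recall_at_k(ranked, gold, k):
    """Precision and recall of the first k ranked keywords against gold keywords."""
    hits = len(set(ranked[:k]) & set(gold))
    return hits / k, hits / len(gold)
```

For example, `build_k_proximity_graph("sensor signal circuit sensor output".split(), k=1)` yields a small graph on which `clustering_coefficient` and `path_variation` can be compared per word. How these quantities are weighted together with position, word gap, and inverse document frequency is specified in the paper itself and is not reproduced here.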

Key words: Complex graph model; Topology potential; Keyword indexing; Average connectivity weight; Comprehensive correlation
Received: 13 August 2014      Published: 16 April 2015
CLC number: TP391.1

Cite this article:

Li Junfeng, Lv Xueqiang, Zhou Shaojun. Patent Keyword Indexing Based on Weighted Complex Graph Model. New Technology of Library and Information Service, 2015, 31(3): 26-32.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2015.03.04     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2015/V31/I3/26


  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn