New Technology of Library and Information Service  2015, Vol. 31 Issue (3): 26-32    DOI: 10.11925/infotech.1003-3513.2015.03.04
Patent Keyword Indexing Based on Weighted Complex Graph Model
Li Junfeng, Lv Xueqiang, Zhou Shaojun
Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science & Technology University, Beijing 100101, China
Abstract  

[Objective] Patent keyword indexing plays an important role in natural language processing and is widely applied in fields such as patent retrieval, translation, and automatic summarization. [Methods] A K-proximity coupled graph is used to convert patents into a complex graph model, and an average connectivity weight is proposed that combines the average path variation, the average clustering coefficient, and the liquidity effect of the current node. Taking into account the location information, the word-gap information, and the inverse document frequency of keywords, a comprehensive patent correlation measure is proposed for the quantitative analysis of keyword importance. [Results] Experiments on patent literature in the sensor domain achieve a precision of 60.9% on the top 8 keywords and a recall of 73.4% on the top 10. [Limitations] Results for low-frequency keywords are not good enough, which degrades the overall indexing result. [Conclusions] The experimental results show that the method is effective and has positive significance for patent indexing.
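
The graph-based idea in the Methods statement can be illustrated with a minimal sketch. It is not the authors' formula: the abstract does not define the average connectivity weight, so the code below assumes that "K-proximity" means linking each word to the words at most K positions after it, and it scores a node only by how much its removal changes the average path length and the average clustering coefficient. The liquidity effect, location, word-gap, and inverse document frequency terms are omitted, and all function names and the sample tokens are hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_k_proximity_graph(tokens, k=2):
    """Link each token to the tokens at most k positions after it (assumed reading)."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        graph[word]  # touch the key so every word becomes a node
        for other in tokens[i + 1 : i + 1 + k]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return dict(graph)


def average_clustering(graph):
    """Mean local clustering coefficient over all nodes."""
    if not graph:
        return 0.0
    total = 0.0
    for nbrs in graph.values():
        if len(nbrs) < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
        total += 2.0 * links / (len(nbrs) * (len(nbrs) - 1))
    return total / len(graph)


def average_path_length(graph):
    """Mean breadth-first distance over all reachable ordered node pairs."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            cur = queue.popleft()
            for nxt in graph[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def remove_node(graph, node):
    """Return a copy of the graph without the given node and its edges."""
    return {n: nbrs - {node} for n, nbrs in graph.items() if n != node}


def removal_scores(graph):
    """Illustrative importance: change in both averages when a node is deleted."""
    base_path = average_path_length(graph)
    base_clust = average_clustering(graph)
    scores = {}
    for node in graph:
        reduced = remove_node(graph, node)
        scores[node] = (abs(average_path_length(reduced) - base_path)
                        + abs(average_clustering(reduced) - base_clust))
    return scores


# Hypothetical token sequence standing in for a segmented patent sentence.
tokens = ["sensor", "signal", "circuit", "sensor", "output", "signal"]
graph = build_k_proximity_graph(tokens, k=2)
for word, score in sorted(removal_scores(graph).items(), key=lambda x: -x[1]):
    print(f"{word}\t{score:.3f}")
```

Words whose removal disturbs the graph most rank first; a fuller implementation would combine such a structural score with the positional and frequency features named in the abstract.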

Key words: Complex graph model; Topology potential; Keyword indexing; Average connectivity weight; Comprehensive correlation
Received: 13 August 2014      Published: 16 April 2015
CLC number: TP391.1

Cite this article:

Li Junfeng, Lv Xueqiang, Zhou Shaojun. Patent Keyword Indexing Based on Weighted Complex Graph Model. New Technology of Library and Information Service, 2015, 31(3): 26-32.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2015.03.04     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2015/V31/I3/26

