Patent Keyword Indexing Based on Weighted Complex Graph Model
Li Junfeng, Lv Xueqiang, Zhou Shaojun |
Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science & Technology University, Beijing 100101, China
Abstract [Objective] Patent keyword indexing plays an important role in natural language processing and is widely applied in fields such as patent retrieval, translation, and automatic summarization. [Methods] A K-proximity coupled graph is used to convert patents into a complex graph model, and an average connectivity weight is proposed that combines the average path variation, the average clustering coefficient, and the current node's liquidity effect. Taking into account the location information, the word-gap information, and the inverse document frequency of keywords, a comprehensive patent correlation calculation method is proposed for quantitatively analyzing keyword importance. [Results] Experiments on patent documents in the sensor domain achieve a precision of 60.9% at top-8 and a recall of 73.4% at top-10. [Limitations] Results for low-frequency keywords are not good enough, which affects the overall indexing result. [Conclusions] The experimental results show that the method is effective and has positive significance for patent indexing.
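The graph-based part of the [Methods] step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy token list, and the reading of "K-proximity" as "link each word to the words within K positions of it" are all assumptions made for illustration. Only the average clustering coefficient term is shown, scored as the change observed when a node is deleted; the path-variation, liquidity, position, word-gap, and inverse-document-frequency terms are omitted.

```python
from collections import defaultdict
from itertools import combinations


def build_k_proximity_graph(tokens, k=2):
    """Undirected word graph; assumed reading: link words at most k positions apart."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        graph[word]  # touching the key registers words that end up with no neighbors
        for other in tokens[i + 1 : i + 1 + k]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return dict(graph)


def local_clustering(graph, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    neighbors = graph[node]
    n = len(neighbors)
    if n < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in graph[a])
    return 2.0 * links / (n * (n - 1))


def average_clustering(graph):
    """Mean local clustering coefficient over all nodes (0.0 for an empty graph)."""
    if not graph:
        return 0.0
    return sum(local_clustering(graph, v) for v in graph) / len(graph)


def remove_node(graph, node):
    """Copy of the graph with one node and its incident edges deleted."""
    return {v: nbrs - {node} for v, nbrs in graph.items() if v != node}


def clustering_change(graph, node):
    """Absolute change in the average clustering coefficient when node is deleted."""
    return abs(average_clustering(graph) - average_clustering(remove_node(graph, node)))


# Hypothetical, already-segmented patent text (illustrative tokens only).
tokens = ["sensor", "signal", "circuit", "sensor", "signal", "output"]
graph = build_k_proximity_graph(tokens, k=1)
ranked = sorted(graph, key=lambda w: clustering_change(graph, w), reverse=True)
print(ranked)
```

In this sketch a word counts as more important when deleting it disturbs the graph's clustering structure more. A fuller version would combine this term with the other weights named in the abstract before ranking candidate keywords.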
Received: 13 August 2014
Published: 16 April 2015