Extracting Patent Keywords by Integrating Restriction Relationship
Yu Yan, Zhu Shengchen
Institute of the Information Management and Technology, Nanjing Tech University, Nanjing 210009, China

Abstract [Objective] This paper aims to improve the accuracy of patent keyword extraction by exploiting the characteristics of patent claims. [Methods] We examined the restriction relationships between the technical features of patent claims and integrated these relationships into a graph-based patent keyword extraction method. [Results] We evaluated our model on patent data sets from USPTO and Baiten. The MRR of our method was 31.79% (USPTO) and 33.81% (Baiten) higher than that of the traditional TextRank method. [Limitations] The experimental data need to be further expanded. [Conclusions] The proposed method can significantly improve the accuracy of patent keyword extraction.
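
The MRR (mean reciprocal rank) figure reported above can be illustrated with a minimal sketch. It assumes the common definition in which each document contributes the reciprocal of the rank of its first correctly extracted keyword, or 0 if none is correct; the abstract does not spell out the exact variant used. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[str], gold: Set[str]) -> float:
    """Return 1/rank of the first ranked candidate found in the gold set, else 0."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate in gold:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average the reciprocal rank over all (ranked candidates, gold keywords) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, gold) for ranked, gold in runs) / len(runs)


# Toy data (illustrative only): ranked extractor output and gold keywords per patent.
documents = [
    (["battery", "electrode", "housing"], {"electrode"}),  # first hit at rank 2
    (["sensor", "valve"], {"sensor", "pump"}),             # first hit at rank 1
    (["frame", "bolt"], {"hinge"}),                        # no hit
]
mrr = mean_reciprocal_rank(documents)
print(f"MRR = {mrr:.4f}")
```

Under this definition, a method that moves correct keywords toward the top of the ranked list raises the MRR even if the set of extracted keywords is unchanged.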

Received: 27 December 2021
Published: 16 November 2022

Fund: National Social Science Fund of China (17BTQ059)
Corresponding Author: Yu Yan, ORCID: 0000-0002-9654-8614
E-mail: yuyanyuyan2004@126.com