1 School of Management, Harbin Institute of Technology, Shenzhen, Shenzhen 518055, China
2 School of Management, Harbin Institute of Technology, Harbin 150006, China
3 Shenzhen Ward Intellectual Property Agency, Shenzhen 518000, China
4 Shenzhen Yingfeng Intellectual Property Consulting Co., Ltd, Shenzhen 518000, China
[Objective] This paper addresses a limitation of traditional single-classification methods, which cannot effectively identify high-quality “bottleneck” technology patents. [Methods] We developed a multi-category polling model (LSTM-Seq-BERT) that combines LSTM, Word2Vec, and BERT to identify high-quality “bottleneck” patents from patent application documents. We also constructed a multi-level label system for the model, with IPC numbers as the primary classification labels and authorization status as the secondary classification labels. [Results] The accuracy of identifying high-quality “bottleneck” technology patents increased to 88.1%. [Limitations] We used only patents from the Guangdong-Hong Kong-Macao Greater Bay Area, which results in data imbalance. [Conclusions] The proposed model improves the accuracy of identifying high-quality “bottleneck” technology patents and has practical value.
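The two-level label system described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the primary label is the four-character IPC subclass and that authorization status takes only two values; the names `PatentLabel`, `ipc_subclass`, `make_label`, and `group_by_primary`, and the sample records, are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class PatentLabel(NamedTuple):
    primary: str    # technology class taken from the IPC number
    secondary: str  # authorization status of the application


def ipc_subclass(ipc_number: str) -> str:
    # Assumption: the first four characters (section, class, subclass)
    # serve as the primary label, e.g. "H01L 21/027" -> "H01L".
    return ipc_number.replace(" ", "").upper()[:4]


def make_label(ipc_number: str, granted: bool) -> PatentLabel:
    # Assumption: authorization status is reduced to two values.
    status = "granted" if granted else "not_granted"
    return PatentLabel(ipc_subclass(ipc_number), status)


def group_by_primary(records):
    # Collect the secondary labels observed under each primary label.
    groups = defaultdict(list)
    for ipc_number, granted in records:
        label = make_label(ipc_number, granted)
        groups[label.primary].append(label.secondary)
    return dict(groups)


# Hypothetical records: (IPC number, was the application granted?)
records = [("H01L 21/027", True), ("H01L 21/66", False), ("G06F 17/30", True)]
grouped = group_by_primary(records)
```

Grouping by the primary label first would let a separate secondary classifier be trained per technology class, which is one plausible reading of a multi-level scheme; the abstract does not specify how the polling step combines these classifiers.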
Zhao Xuefeng, Wu Delin, Wu Weiwei, Sun Zhuoluo, Hu Jinjin, Lian Ying, Shan Jiayu. Identifying High-Quality Technology Patents Based on Deep Learning and Multi-Category Polling Mechanism: Case Study of Patent Applications[J]. Data Analysis and Knowledge Discovery, 2023, 7(8): 30-45.