Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (12): 40-51    DOI: 10.11925/infotech.2096-3467.2023.0219
Detecting Inventors of Breakthrough Innovation Based on Dynamic Learning of Patent Knowledge Graph
Yu Bowen, Liu Xiang
School of Information Management, Central China Normal University, Wuhan 430079, China
Abstract  

[Objective] This paper aims to identify inventors of breakthrough innovation from their collaboration and citation features. [Methods] First, we defined metrics for breakthrough innovation inventors. Second, we examined the features of inventors' cooperation and citation relationships. Third, we built a statistical learning model, based on dynamic learning of the patent knowledge graph, to predict inventors' future innovations. Finally, we analyzed the characteristics of breakthrough innovation inventors. [Results] Tested on patent data, the model reached an overall prediction accuracy of 83.51%. Its accuracy for breakthrough and continuation innovation inventors reached 85.99% and 81.40%, respectively. Collaboration and citation-related features ranked high in importance for the prediction. [Limitations] The ambiguity of the patent technological innovation metric around the value of 0 was not fully resolved, so we filtered out inventors whose categories could not be identified. [Conclusions] The proposed model can discover breakthrough innovation inventors at an earlier stage.

Key words: Patent Analysis; Citation Network; Cooperative Network; Breakthrough Innovation; Statistical Learning
Received: 17 March 2023      Published: 08 January 2024
ZTFLH: G305; TP181
Fund: National Natural Science Foundation of China (71673106)
Corresponding Author: Liu Xiang, ORCID: 0000-0003-4315-2699, E-mail: xiangliu@ccnu.edu.cn

Cite this article:

Yu Bowen, Liu Xiang. Detecting Inventors of Breakthrough Innovation Based on Dynamic Learning of Patent Knowledge Graph. Data Analysis and Knowledge Discovery, 2023, 7(12): 40-51.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2023.0219     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2023/V7/I12/40

Patent Knowledge Graph
A Time-Sliced Example of Dynamic Patent Citation Network
Algorithm Workflow
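A time-sliced view of a dynamic citation network can be sketched as a series of cumulative snapshots, where each slice holds every citation made up to a cutoff year. This is a minimal illustration only: the patent IDs, years, and the function names `cumulative_slices` and `in_degree` are hypothetical and do not come from the paper.

```python
from collections import defaultdict

# Hypothetical citation records: (citing_patent, cited_patent, year_of_citing_patent).
CITATIONS = [
    ("P2", "P1", 2001),
    ("P3", "P1", 2002),
    ("P3", "P2", 2002),
    ("P4", "P3", 2004),
]


def cumulative_slices(citations, cutoffs):
    """Return {cutoff_year: edges observed up to and including that year}."""
    return {
        cutoff: [(src, dst) for src, dst, year in citations if year <= cutoff]
        for cutoff in cutoffs
    }


def in_degree(edges):
    """Count the citations each patent has received within one snapshot."""
    counts = defaultdict(int)
    for _, cited in edges:
        counts[cited] += 1
    return dict(counts)


slices = cumulative_slices(CITATIONS, cutoffs=[2001, 2002, 2004])
for cutoff, edges in slices.items():
    print(cutoff, in_degree(edges))
```

Comparing the same patent's citation count across slices gives a simple picture of how its position in the network changes over time.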
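Feature group                      Field                 Description
Inventor collaboration features    StgCoopsNum           Total number of the inventor's collaborators
                                   StgAvgTeamSize        Average collaboration team size
                                   StgAvgCoopsCitedNum   Average citation count of the collaborators
                                   StgSumCoopsCitedNum   Sum of the collaborators' citation counts
Inventor citation features         StgPNum               Number of the inventor's patents
                                   StgSumPCitedNum       Sum of citations received by the inventor's published patents
                                   StgMaxPCitedNum       Highest citation count among the inventor's published patents
                                   StgAvgPCitedNum       Average citation count of the inventor's published patents
                                   StgSumPCitedRank      Citation-count rank of the inventor's published patents
                                   StgAvgPRefNum         Sum of the number of patents referenced by the inventor's published patents
                                   TotSumPCitedNum       Total citations received by all of the inventor's published patents
                                   TotHindex             The inventor's h-index
Inventor network structure         CoopNWCloseness       Closeness centrality of the inventor's node in the collaboration network
features                           CoopNWClustering      Clustering coefficient of the inventor's node in the collaboration network
                                   CoopNWPR              PageRank value of the inventor's node in the collaboration network
                                   CiteNWClustering      Clustering coefficient of the inventor's node in the citation network
                                   CiteNWPR              PageRank value of the inventor in the citation network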
Features of the Model
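To make a few of the listed fields concrete, the sketch below computes them for one inventor from toy records. It is an illustration under stated assumptions, not the authors' implementation: the patent data and the function `inventor_features` are hypothetical, team size is assumed to include the inventor, and a single time stage is used, so the stage-level and total values coincide.

```python
# Hypothetical records for one time stage: patent id -> (inventors, citations received).
PATENTS = {
    "P1": (["A", "B", "C"], 10),
    "P2": (["A", "B"], 3),
    "P3": (["A"], 1),
    "P4": (["B", "D"], 7),
}


def inventor_features(inventor, patents):
    """Compute a few illustrative features for an inventor with at least one patent."""
    own = [(team, cited) for team, cited in patents.values() if inventor in team]
    collaborators = {name for team, _ in own for name in team if name != inventor}
    cited = sorted((c for _, c in own), reverse=True)
    # h-index: the largest h such that h patents have at least h citations each.
    h_index = sum(1 for rank, c in enumerate(cited, start=1) if c >= rank)
    return {
        "StgPNum": len(own),
        "StgCoopsNum": len(collaborators),
        # Assumption: team size counts the inventor as a member.
        "StgAvgTeamSize": sum(len(team) for team, _ in own) / len(own),
        "StgSumPCitedNum": sum(cited),
        "StgMaxPCitedNum": max(cited),
        "StgAvgPCitedNum": sum(cited) / len(cited),
        "TotHindex": h_index,
    }


features_a = inventor_features("A", PATENTS)
print(features_a)
```

The network-structure fields (closeness, clustering, PageRank) would need a graph built over all inventors and are left out of this sketch.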
Model                  MaxCD5 classification        MeanCD5 classification
                       A/%     R/%     P/%          A/%     R/%     P/%
Logistic regression    69.99   69.87   71.57        67.83   67.84   69.76
Nonlinear SVM          70.15   70.03   71.66        69.98   69.99   71.48
Random forest          83.51   83.47   83.69        78.57   78.57   78.73
Experimental Results for Two Classification Methods
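The labels MaxCD5 and MeanCD5 are not defined in this excerpt. The sketch below assumes they refer to the maximum and the mean, over an inventor's patents, of a CD-type disruption index measured in a five-year window, and that the sign of the aggregated value separates breakthrough from continuation inventors. Both readings are assumptions, as are the function names, the toy data, and the `margin` parameter used to set aside values near 0.

```python
from statistics import mean


def cd_index(focal, focal_refs, later_patents):
    """Unweighted CD index: (only_focal - both) / (only_focal + both + only_refs).

    focal_refs: set of patents cited by the focal patent.
    later_patents: {later_patent: set of patents it cites}, already restricted
        to the observation window (assumed here to be five years).
    """
    only_focal = both = only_refs = 0
    for cited in later_patents.values():
        cites_focal = focal in cited
        cites_refs = bool(cited & focal_refs)
        if cites_focal and cites_refs:
            both += 1
        elif cites_focal:
            only_focal += 1
        elif cites_refs:
            only_refs += 1
    total = only_focal + both + only_refs
    return (only_focal - both) / total if total else None


def label_inventor(cd_values, aggregate=max, margin=0.0):
    """Label an inventor from per-patent CD values (hypothetical rule)."""
    score = aggregate(cd_values)
    if score > margin:
        return "breakthrough"
    if score < -margin:
        return "continuation"
    return "unidentified"  # too close to 0 to classify


later = {"L1": {"F"}, "L2": {"F"}, "L3": {"F", "R1"}, "L4": {"R2"}, "L5": {"X"}}
cd_f = cd_index("F", {"R1", "R2"}, later)

inventor_cd5 = [0.25, -0.5, 0.1]  # hypothetical per-patent values
print(cd_f, label_inventor(inventor_cd5, max), label_inventor(inventor_cd5, mean))
```

With these toy values the two aggregations disagree, which illustrates why the choice between a maximum-based and a mean-based label can change the classification.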
Innovation type    MaxCD5 classification        MeanCD5 classification
                   A/%     R/%     P/%          A/%     R/%     P/%
Breakthrough       85.99   79.71   82.73        80.82   74.97   77.78
Continuation       81.40   87.24   84.22        76.63   82.19   79.31
Random Forest Performance
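The table does not spell out how each per-class column is computed. As general background, per-class metrics for a two-class problem are usually derived from a confusion matrix as sketched below. The counts are hypothetical and are not the paper's data, and the function `class_metrics` is an illustrative name.

```python
def class_metrics(tp, fp, fn):
    """Return (precision, recall, F1) for one class from its confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical confusion matrix: 100 inventors of each true class.
# true breakthrough -> 80 predicted breakthrough, 20 predicted continuation
# true continuation -> 10 predicted breakthrough, 90 predicted continuation
breakthrough = class_metrics(tp=80, fp=10, fn=20)
continuation = class_metrics(tp=90, fp=20, fn=10)
accuracy = (80 + 90) / 200

for name, (p, r, f) in [("breakthrough", breakthrough), ("continuation", continuation)]:
    print(f"{name}: precision={p:.4f} recall={r:.4f} f1={f:.4f}")
print(f"overall accuracy={accuracy:.4f}")
```

In a two-class setting the false positives of one class are the false negatives of the other, so one class typically trades higher precision for lower recall relative to the other.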
Feature                Importance score
CoopNWPR               0.1079
CiteNWPR               0.1046
StgSumPCitedRank       0.0855
StgAvgPRefNum          0.0791
StgMaxPCitedNum        0.0757
StgSumPCitedNum        0.0730
CoopNWCloseness        0.0719
StgAvgPCitedNum        0.0689
StgAvgCoopsCitedNum    0.0660
StgSumCoopsCitedNum    0.0612
StgAvgTeamSize         0.0525
TotSumPCitedNum        0.0478
StgCoopsNum            0.0387
CiteNWClustering       0.0294
TotHindex              0.0213
StgPNum                0.0151
CoopNWClustering       0.0012
Importance Scores for Each Feature in Random Forest
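One way to read these scores is to add them up within each feature group. The sketch below does so using the scores from this table and the grouping from the feature table; the group keys (`collaboration`, `citation`, `network`) and variable names are labels chosen here for illustration.

```python
# Importance scores as reported in the table.
IMPORTANCE = {
    "CoopNWPR": 0.1079, "CiteNWPR": 0.1046, "StgSumPCitedRank": 0.0855,
    "StgAvgPRefNum": 0.0791, "StgMaxPCitedNum": 0.0757, "StgSumPCitedNum": 0.0730,
    "CoopNWCloseness": 0.0719, "StgAvgPCitedNum": 0.0689,
    "StgAvgCoopsCitedNum": 0.0660, "StgSumCoopsCitedNum": 0.0612,
    "StgAvgTeamSize": 0.0525, "TotSumPCitedNum": 0.0478, "StgCoopsNum": 0.0387,
    "CiteNWClustering": 0.0294, "TotHindex": 0.0213, "StgPNum": 0.0151,
    "CoopNWClustering": 0.0012,
}

# Grouping taken from the feature table; the short keys are illustrative labels.
GROUPS = {
    "collaboration": ["StgCoopsNum", "StgAvgTeamSize", "StgAvgCoopsCitedNum",
                      "StgSumCoopsCitedNum"],
    "citation": ["StgPNum", "StgSumPCitedNum", "StgMaxPCitedNum", "StgAvgPCitedNum",
                 "StgSumPCitedRank", "StgAvgPRefNum", "TotSumPCitedNum", "TotHindex"],
    "network": ["CoopNWCloseness", "CoopNWClustering", "CoopNWPR",
                "CiteNWClustering", "CiteNWPR"],
}

totals = {
    group: round(sum(IMPORTANCE[field] for field in fields), 4)
    for group, fields in GROUPS.items()
}
ranked = sorted(IMPORTANCE, key=IMPORTANCE.get, reverse=True)

print(totals)
print("top three:", ranked[:3])
```

Group totals depend on group size (eight citation fields against four collaboration and five network fields), so they complement the per-feature ranking and do not replace it.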
Pearson Correlation Coefficient Between Features
Innovation type    MaxCD5 classification        MeanCD5 classification
                   A/%     R/%     P/%          A/%     R/%     P/%
Breakthrough       86.04   79.05   82.40        80.03   76.20   78.06
Continuation       80.94   87.40   84.05        77.25   80.96   79.06
Random Forest Performance (Highly Correlated Features Removed)
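Removing highly correlated features is commonly done by computing pairwise Pearson coefficients and keeping only one feature from each strongly correlated pair. The sketch below shows one greedy variant under stated assumptions: the threshold of 0.9, the toy columns, and the function names are hypothetical, and this excerpt does not say which features the authors removed or which cutoff they used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def drop_correlated(columns, threshold=0.9):
    """Greedily keep a feature only if |r| < threshold against all kept features."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns for five inventors.
COLUMNS = {
    "StgSumPCitedNum": [1, 2, 3, 4, 5],
    "TotSumPCitedNum": [2, 4, 6, 8, 11],  # nearly proportional to the column above
    "StgAvgTeamSize": [3, 1, 4, 1, 5],
}

kept = drop_correlated(COLUMNS)
print(kept)
```

The result of a greedy filter depends on the order in which features are visited, so placing the more important feature first keeps it when a pair is highly correlated.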