Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (12): 40-51     https://doi.org/10.11925/infotech.2096-3467.2023.0219
Research Article
Detecting Inventors of Breakthrough Innovation Based on Dynamic Learning of Patent Knowledge Graph*
Yu Bowen, Liu Xiang
School of Information Management, Central China Normal University, Wuhan 430079, China
Abstract

[Objective] This paper aims to identify breakthrough innovation inventors in advance, using features of their collaboration and citation relationships. [Methods] First, we define metrics for breakthrough innovation inventors. Then, we mine the inventors' collaboration and citation features and, based on dynamic learning of the patent knowledge graph, build a statistical learning model that predicts an inventor's future innovation type, enabling early detection of breakthrough innovation inventors. Finally, we analyze the key features of breakthrough innovation inventors. [Results] Experiments on real patent data show that the random forest model reaches an overall prediction accuracy of 83.51%, with accuracies of 85.99% and 81.40% for breakthrough and continuation innovation inventors, respectively. When predicting breakthrough innovation inventors, the collaboration- and citation-related features all rank high in importance score. [Limitations] The ambiguity of the patent technological innovation metric around the value of 0 is not fully resolved; inventors whose innovation type cannot be identified because of this problem are filtered out. [Conclusions] The proposed model can detect breakthrough innovation inventors early from multi-dimensional features, and both collaboration- and citation-related features are important for predicting an inventor's future innovation type.

Keywords: Patent Analysis; Citation Network; Cooperative Network; Breakthrough Innovation; Statistical Learning
Received: 2023-03-17      Published: 2024-01-08
CLC number (ZTFLH):  G305; TP181
Funding: *National Natural Science Foundation of China (71673106)
Corresponding author: Liu Xiang, ORCID: 0000-0003-4315-2699, E-mail: xiangliu@ccnu.edu.cn
Cite this article:
Yu Bowen, Liu Xiang. Detecting Inventors of Breakthrough Innovation Based on Dynamic Learning of Patent Knowledge Graph[J]. Data Analysis and Knowledge Discovery, 2023, 7(12): 40-51.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2023.0219      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2023/V7/I12/40
Fig.1  Patent knowledge graph
Fig.2  Example time slices of the dynamic patent citation network
Fig.3  Algorithm flowchart
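| Feature group | Field | Description |
| --- | --- | --- |
| Inventor cooperation features | StgCoopsNum | Total number of the inventor's collaborators |
| | StgAvgTeamSize | Average collaboration team size |
| | StgAvgCoopsCitedNum | Average number of citations received by collaborators |
| | StgSumCoopsCitedNum | Total number of citations received by collaborators |
| Inventor citation features | StgPNum | Number of the inventor's patents |
| | StgSumPCitedNum | Total citations received by the inventor's published patents |
| | StgMaxPCitedNum | Highest citation count among the inventor's published patents |
| | StgAvgPCitedNum | Average citations received by the inventor's published patents |
| | StgSumPCitedRank | Citation-count rank of the inventor's published patents |
| | StgAvgPRefNum | Sum of the numbers of patents referenced by the inventor's published patents |
| | TotSumPCitedNum | Total citations received by all patents the inventor has published |
| | TotHindex | The inventor's h-index |
| Inventor network structure features | CoopNWCloseness | Closeness centrality of the inventor's node in the cooperation network |
| | CoopNWClustering | Clustering coefficient of the inventor's node in the cooperation network |
| | CoopNWPR | PageRank value of the inventor's node in the cooperation network |
| | CiteNWClustering | Clustering coefficient of the inventor's node in the citation network |
| | CiteNWPR | PageRank value of the inventor in the citation network |

Table 1  Features of the prediction model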
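To make the citation features in Table 1 concrete, here is a minimal Python sketch that derives a few of them from per-patent citation counts. The function names, the toy numbers, and the reading of the `Stg` prefix as "within one observation stage" and `Tot` as "over all of the inventor's patents" are assumptions made for illustration; the page does not spell out how the authors computed these fields.

```python
from statistics import mean


def h_index(citations):
    """Largest h such that h patents each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cited in enumerate(ranked, start=1):
        if cited >= rank:
            h = rank
        else:
            break
    return h


def citation_features(stage_citations, all_citations):
    """Hypothetical helper: build a feature dict from citation counts.

    stage_citations: citation counts of patents in the current stage (assumed
    meaning of the "Stg" prefix); all_citations: counts for every patent of
    the inventor (assumed meaning of the "Tot" prefix).
    """
    return {
        "StgPNum": len(stage_citations),
        "StgSumPCitedNum": sum(stage_citations),
        "StgMaxPCitedNum": max(stage_citations, default=0),
        "StgAvgPCitedNum": mean(stage_citations) if stage_citations else 0.0,
        "TotSumPCitedNum": sum(all_citations),
        "TotHindex": h_index(all_citations),
    }


# Toy data, not taken from the paper.
features = citation_features(
    stage_citations=[4, 0, 7],
    all_citations=[4, 0, 7, 12, 3, 1],
)
print(features)
```

The remaining fields (rankings and the network structure measures) need the full inventor population or the cooperation and citation graphs, so they are left out of this sketch.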
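| Model | MaxCD5 A/% | MaxCD5 R/% | MaxCD5 P/% | MeanCD5 A/% | MeanCD5 R/% | MeanCD5 P/% |
| --- | --- | --- | --- | --- | --- | --- |
| Logistic regression | 69.99 | 69.87 | 71.57 | 67.83 | 67.84 | 69.76 |
| Nonlinear SVM | 70.15 | 70.03 | 71.66 | 69.98 | 69.99 | 71.48 |
| Random forest | 83.51 | 83.47 | 83.69 | 78.57 | 78.57 | 78.73 |

Table 2  Experimental results for the two classification settings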
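| Innovation type | MaxCD5 A/% | MaxCD5 R/% | MaxCD5 P/% | MeanCD5 A/% | MeanCD5 R/% | MeanCD5 P/% |
| --- | --- | --- | --- | --- | --- | --- |
| Breakthrough | 85.99 | 79.71 | 82.73 | 80.82 | 74.97 | 77.78 |
| Continuation | 81.40 | 87.24 | 84.22 | 76.63 | 82.19 | 79.31 |

Table 3  Experimental results of the random forest model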
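The numbers in Tables 2 and 3 can be cross-checked with a little arithmetic. The Python sketch below copies the random forest values from both tables and tests two relationships: whether the third numeric column of Table 3 equals the harmonic mean of the first two, and whether the R and P values in Table 2 equal the two-class averages of Table 3's second and first columns. If both hold, the Table 3 columns behave like per-class precision, recall, and an F1-style score. That reading is an inference from the arithmetic, not something the page states, and the 0.02 tolerance for rounding is an assumption.

```python
# Random forest row of Table 2: (A, R, P) per classification setting.
TABLE2_RF = {
    "MaxCD5": (83.51, 83.47, 83.69),
    "MeanCD5": (78.57, 78.57, 78.73),
}

# Table 3: the three numeric columns per innovation type and setting.
TABLE3 = {
    "MaxCD5": {
        "breakthrough": (85.99, 79.71, 82.73),
        "continuation": (81.40, 87.24, 84.22),
    },
    "MeanCD5": {
        "breakthrough": (80.82, 74.97, 77.78),
        "continuation": (76.63, 82.19, 79.31),
    },
}


def harmonic_mean(a, b):
    return 2 * a * b / (a + b)


def max_deviation():
    """Largest gap between the reported values and the recomputed ones."""
    worst = 0.0
    for setting, rows in TABLE3.items():
        firsts, seconds = [], []
        for first, second, third in rows.values():
            # Third column versus harmonic mean of the first two.
            worst = max(worst, abs(harmonic_mean(first, second) - third))
            firsts.append(first)
            seconds.append(second)
        _, recall, precision = TABLE2_RF[setting]
        # Table 2 R and P versus two-class averages of Table 3 columns.
        worst = max(worst, abs(sum(seconds) / len(seconds) - recall))
        worst = max(worst, abs(sum(firsts) / len(firsts) - precision))
    return worst


print(f"largest deviation: {max_deviation():.4f} percentage points")
```

Every recomputed value stays within rounding distance of the reported one, so the two tables are internally consistent under this reading.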
| Feature | Importance score | Feature | Importance score |
| --- | --- | --- | --- |
| CoopNWPR | 0.1079 | StgSumCoopsCitedNum | 0.0612 |
| CiteNWPR | 0.1046 | StgAvgTeamSize | 0.0525 |
| StgSumPCitedRank | 0.0855 | TotSumPCitedNum | 0.0478 |
| StgAvgPRefNum | 0.0791 | StgCoopsNum | 0.0387 |
| StgMaxPCitedNum | 0.0757 | CiteNWClustering | 0.0294 |
| StgSumPCitedNum | 0.0730 | TotHindex | 0.0213 |
| CoopNWCloseness | 0.0719 | StgPNum | 0.0151 |
| StgAvgPCitedNum | 0.0689 | CoopNWClustering | 0.0012 |
| StgAvgCoopsCitedNum | 0.0660 | | |

Table 4  Feature importance scores of the random forest model
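One way to read Table 4 is to add up the importance scores within each feature group of Table 1. The Python sketch below does this with the scores copied from Table 4; the grouping follows Table 1, while the aggregation itself is an illustrative choice and not an analysis reported on this page.

```python
# Importance scores copied from Table 4.
IMPORTANCE = {
    "CoopNWPR": 0.1079, "CiteNWPR": 0.1046, "StgSumPCitedRank": 0.0855,
    "StgAvgPRefNum": 0.0791, "StgMaxPCitedNum": 0.0757,
    "StgSumPCitedNum": 0.0730, "CoopNWCloseness": 0.0719,
    "StgAvgPCitedNum": 0.0689, "StgAvgCoopsCitedNum": 0.0660,
    "StgSumCoopsCitedNum": 0.0612, "StgAvgTeamSize": 0.0525,
    "TotSumPCitedNum": 0.0478, "StgCoopsNum": 0.0387,
    "CiteNWClustering": 0.0294, "TotHindex": 0.0213,
    "StgPNum": 0.0151, "CoopNWClustering": 0.0012,
}

# Feature groups as listed in Table 1.
GROUPS = {
    "cooperation": ["StgCoopsNum", "StgAvgTeamSize",
                    "StgAvgCoopsCitedNum", "StgSumCoopsCitedNum"],
    "citation": ["StgPNum", "StgSumPCitedNum", "StgMaxPCitedNum",
                 "StgAvgPCitedNum", "StgSumPCitedRank", "StgAvgPRefNum",
                 "TotSumPCitedNum", "TotHindex"],
    "network": ["CoopNWCloseness", "CoopNWClustering", "CoopNWPR",
                "CiteNWClustering", "CiteNWPR"],
}


def group_totals():
    """Sum the importance scores of the features in each group."""
    return {
        group: round(sum(IMPORTANCE[name] for name in names), 4)
        for group, names in GROUPS.items()
    }


totals = group_totals()
overall = round(sum(IMPORTANCE.values()), 4)
print(totals, overall)
```

The 17 scores sum to roughly 1, as expected for normalized importance scores, with the small gap attributable to rounding in the table.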
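Fig.4  Pearson correlation coefficients between features

| Innovation type | MaxCD5 A/% | MaxCD5 R/% | MaxCD5 P/% | MeanCD5 A/% | MeanCD5 R/% | MeanCD5 P/% |
| --- | --- | --- | --- | --- | --- | --- |
| Breakthrough | 86.04 | 79.05 | 82.40 | 80.03 | 76.20 | 78.06 |
| Continuation | 80.94 | 87.40 | 84.05 | 77.25 | 80.96 | 79.06 |

Table 5  Experimental results of the random forest model (highly correlated features removed)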
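Fig.4 and Table 5 point to a step in which features are screened by their Pearson correlation before the model is retrained. A minimal Python sketch of such a screen follows. The 0.9 threshold, the greedy keep-first rule, the function names, and the toy values are all assumptions; the page does not say which features were removed or what cutoff was used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (spread_x * spread_y)


def drop_correlated(columns, threshold=0.9):
    """Greedy screen: keep a feature only if |r| with every kept one is below threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy values for five hypothetical inventors, not taken from the paper.
toy = {
    "StgSumPCitedNum": [3, 10, 6, 20, 1],
    "StgMaxPCitedNum": [2, 7, 4, 15, 1],  # tracks the sum closely
    "StgAvgTeamSize": [3, 2, 5, 2, 4],
}
kept = drop_correlated(toy)
print(kept)
```

Because the screen is greedy, the result depends on the order of the columns; ordering them by importance score first would keep the more informative feature of each correlated pair.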