Data Analysis and Knowledge Discovery, 2023, Vol. 7, Issue 6: 15-25     https://doi.org/10.11925/infotech.2096-3467.2022.0361
Research Paper
Technology Recognition and Link Prediction Method Based on GNN*
Xu Xin, Li Qian, Yao Zhanlei
Faculty of Economics and Management, East China Normal University, Shanghai 200062, China
Abstract

[Objective] This paper integrates time features into a patent IPC co-occurrence network and trains a graph neural network (GNN) for link prediction, aiming to provide a reference for technology discovery and knowledge supply. [Methods] We collected patent data on “privacy protection” and constructed an IPC co-occurrence network. For each node we built three temporal features: time distribution, time stability, and time attention. We then trained a GraphSAGE model to obtain representations of the IPC nodes and link prediction scores between them, which support technology opportunity mining. [Results] Compared with traditional link prediction methods based on node similarity and with the graph random-walk algorithm Node2Vec, the GNN-based method improved AUC by about 30%. [Limitations] As a deep learning model, the GNN is at a disadvantage in training time. [Conclusions] The GNN-based link prediction method achieves high prediction accuracy. With time features added, it captures the dynamic characteristics of nodes and provides a useful reference for technology discovery and related tasks.
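The abstract names GraphSAGE but does not give the layer configuration or the scoring function. The following is a minimal sketch, assuming a single mean-aggregation layer and a sigmoid of the embedding dot product as the link score. The node names, the two-dimensional input features, and the fixed weight matrix are all hypothetical, and nothing here is trained.

```python
import math

def mean_vec(vectors, dim):
    """Element-wise mean of a list of vectors (zero vector if the list is empty)."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def sage_layer(features, adj, weight):
    """One GraphSAGE-style layer with mean aggregation.

    features: node -> list of d floats
    adj:      node -> set of neighbour nodes
    weight:   list of rows, each of length 2 * d (assumed fixed, not learned here)
    """
    out = {}
    for node, h in features.items():
        agg = mean_vec([features[u] for u in adj.get(node, ())], len(h))
        concat = h + agg  # own features followed by the neighbourhood mean
        z = [max(0.0, sum(w * x for w, x in zip(row, concat))) for row in weight]  # ReLU
        norm = math.sqrt(sum(x * x for x in z)) or 1.0
        out[node] = [x / norm for x in z]  # L2-normalise the embedding
    return out

def link_score(emb, u, v):
    """Assumed scoring rule: sigmoid of the dot product of two node embeddings."""
    dot = sum(a * b for a, b in zip(emb[u], emb[v]))
    return 1.0 / (1.0 + math.exp(-dot))

# Hypothetical toy graph with two input features per node.
features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
adj = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}}
weight = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]]

emb = sage_layer(features, adj, weight)
score_ac = link_score(emb, "A", "C")
```

In a real setting the weight matrix would be learned from observed and sampled non-existent links, and the input features would be the time features described in the abstract.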

Keywords: Link Prediction; Graph Neural Network; Time Features; Technology Discovery
Received: 2022-04-18      Published: 2023-08-09
ZTFLH (Chinese Library Classification number): G35
Funding: * Key Soft Science Project of the 2021 Shanghai “Science and Technology Innovation Action Plan” (21692195900)
Corresponding author: Xu Xin, ORCID: 0000-0001-7020-3135, E-mail: xxu@infor.ecnu.edu.cn
Cite this article:
Xu Xin, Li Qian, Yao Zhanlei. Technology Recognition and Link Prediction Method Based on GNN. Data Analysis and Knowledge Discovery, 2023, 7(6): 15-25.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022.0361      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2023/V7/I6/15
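Property                          Patent IPC co-occurrence network
Number of nodes                   4,595
Number of edges                   45,436
Average clustering coefficient    0.6835
Table 1  Basic properties of the dataset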
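The three properties in Table 1 can be computed directly from the IPC codes listed on each patent. Below is a minimal sketch, assuming an unweighted, undirected network in which two codes are linked when they appear on the same patent. The toy patent list is hypothetical, and the clustering coefficient uses the common convention that nodes with fewer than two neighbours contribute 0.

```python
from itertools import combinations

# Hypothetical toy input: the IPC codes listed on each patent.
patent_ipcs = [
    ["G06F21/62", "G06F21/60", "H04L29/06"],
    ["G06F21/62", "G06N3/08"],
    ["G06F21/60", "H04L29/06"],
]

def build_adjacency(ipc_lists):
    """Link every pair of IPC codes that appear on the same patent."""
    adj = {}
    for codes in ipc_lists:
        unique = set(codes)  # ignore repeated codes within one patent
        for code in unique:
            adj.setdefault(code, set())
        for a, b in combinations(unique, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def network_stats(adj):
    """Return (node count, edge count, average clustering coefficient)."""
    n_nodes = len(adj)
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2  # each edge is stored twice
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # convention: coefficient is 0 when degree < 2
        closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * closed / (k * (k - 1))
    avg_clustering = total / n_nodes if n_nodes else 0.0
    return n_nodes, n_edges, avg_clustering

stats = network_stats(build_adjacency(patent_ipcs))
```

For scale, the figures in Table 1 imply an average degree of 2 × 45,436 / 4,595 ≈ 19.8 co-occurring codes per IPC node.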
Fig. 1  Effect of model parameters
Fig. 2  Comparison with baseline models
Method         Training set ratio
               60%      65%      70%      75%      80%      85%      90%
AA             0.6707   0.6815   0.6574   0.7177   0.6587   0.7460   0.6905
CN             0.6737   0.6849   0.5880   0.7033   0.6228   0.6160   0.5952
Jaccard        0.6817   0.6942   0.6240   0.7115   0.6946   0.6560   0.7262
PA             0.6587   0.6507   0.6454   0.6651   0.6647   0.6984   0.5952
RA             0.6617   0.6541   0.6640   0.6124   0.7126   0.6160   0.6429
Node2Vec       0.8346   0.8249   0.8238   0.8264   0.8414   0.8330   0.8382
GraphSAGE_T    0.9501   0.9602   0.9479   0.9518   0.9534   0.9553   0.9404
Table 2  AUC comparison of link prediction results
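The first five rows of Table 2 are similarity-based baselines. The sketch below assumes the abbreviations carry their usual meanings (CN: common neighbours, Jaccard, AA: Adamic-Adar, RA: resource allocation, PA: preferential attachment) and computes AUC as the share of (held-out link, non-link) pairs in which the held-out link scores higher, with ties counted as one half. The toy graph is hypothetical, and the page does not state how the authors sampled negative pairs.

```python
import math

def neighbors(edge_list):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def similarity_scores(adj, x, y):
    """Five classic similarity indices for a pair of distinct nodes x and y."""
    nx, ny = adj.get(x, set()), adj.get(y, set())
    common = nx & ny
    union = nx | ny
    return {
        "CN": float(len(common)),
        "Jaccard": len(common) / len(union) if union else 0.0,
        # A common neighbour of two distinct nodes has degree >= 2, so log > 0.
        "AA": sum(1.0 / math.log(len(adj[z])) for z in common),
        "RA": sum(1.0 / len(adj[z]) for z in common),
        "PA": float(len(nx) * len(ny)),
    }

def auc(pos_scores, neg_scores):
    """Probability that a positive outranks a negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical training graph, one held-out link and one non-link.
train_adj = neighbors([("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")])
held_out = similarity_scores(train_adj, "b", "d")
non_link = similarity_scores(train_adj, "a", "e")
cn_auc = auc([held_out["CN"]], [non_link["CN"]])
```

Varying the share of edges kept in the training graph, as in the columns of Table 2, only changes which edges are passed to `neighbors` and which are held out.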
Fig. 3  Effect of training set ratio on running time: baseline models
Fig. 4  Effect of training set ratio on running time: baseline models and GraphSAGE_T
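Model             × marks    AUC
GraphSAGE_T_V1    ×          0.9361
GraphSAGE_T_V2    ×          0.9315
GraphSAGE_T_V3    ×          0.9209
GraphSAGE_T_V4    × × ×      0.8841
GraphSAGE_T                  0.9602
Feature columns in the original table: year distribution feature, time attention feature, time stability feature. A × presumably marks a feature left out of that variant; the extracted text does not show which column each single × belongs to.
Table 3  Effect of time features on AUC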
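One way to read Table 3 is as the AUC lost by each variant relative to the full GraphSAGE_T model. The short sketch below does that subtraction using only the AUC values reported in the table; the variable names are illustrative.

```python
# AUC values copied from Table 3.
full_auc = 0.9602
variant_auc = {
    "GraphSAGE_T_V1": 0.9361,
    "GraphSAGE_T_V2": 0.9315,
    "GraphSAGE_T_V3": 0.9209,
    "GraphSAGE_T_V4": 0.8841,
}

# Absolute AUC drop of each variant compared with the full model.
auc_drop = {name: round(full_auc - value, 4) for name, value in variant_auc.items()}

# Variant with the largest drop.
largest = max(auc_drop, key=auc_drop.get)
```

The drops are 0.0241, 0.0287, 0.0393, and 0.0761, so the variant marked with three × loses the most, which is consistent with the abstract's claim that the time features contribute to accuracy.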
Fig. 5  Prediction results for strengthening links
IPC combination              Link score
<G06F21/62, G06F21/60>       0.8171
<G06F21/62, H04L29/06>       0.7761
<G06F21/62, G06F21/64>       0.7121
<G06F21/62, G06K9/62>        0.7119
<G06F21/62, H04L29/08>       0.7052
<G06F21/62, G06N3/08>        0.7006
<G06F21/62, G06F21/32>       0.6965
<G06F21/62, G06N3/04>        0.6926
<G06F21/62, G06Q40/04>       0.6825
<G06F21/62, G06F17/30>       0.6749
<G06F21/62, G06Q20/38>       0.6742
<G06F21/62, H04L9/08>        0.6705
<G06F21/62, H04L9/32>        0.6655
<G06F21/62, G06K9/00>        0.6589
<G06F21/62, H04L9/00>        0.6526
Table 4  IPC combinations predicted as strengthening links for G06F21/62
Fig. 6  Prediction results for emerging links
IPC combination              Link score
<G06F21/62, G11B11/00>       0.8657
<G06F21/62, G02B27/00>       0.8357
<G06F21/62, G06T5/40>        0.8063
<G06F21/62, G11B27/10>       0.7625
<G06F21/62, G06F8/71>        0.7515
<G06F21/62, G06F7/72>        0.7314
<G06F21/62, G06F12/06>       0.7295
<G06F21/62, B65F1/06>        0.7173
<G06F21/62, G06T1/20>        0.7021
<G06F21/62, A47B37/00>       0.6946
<G06F21/62, G06T5/10>        0.6914
<G06F21/62, G06F11/10>       0.6895
<G06F21/62, A47C1/00>        0.6876
<G06F21/62, A47B13/00>       0.6850
<G06F21/62, G06F21/54>       0.6817
Table 5  IPC combinations predicted as emerging links for G06F21/62
Fig. 7  Prediction results for declining links
IPC combination              Link score
<G06F21/62, G08B25/08>       0.1457
<G06F21/62, G08B25/00>       0.1796
<G06F21/62, H04W4/02>        0.2168
<G06F21/62, H04L12/28>       0.2355
<G06F21/62, H04W8/24>        0.2526
<G06F21/62, H04W4/14>        0.2554
<G06F21/62, H04N5/232>       0.2721
<G06F21/62, H04M1/57>        0.2721
Table 6  IPC combinations with declining links for G06F21/62
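Tables 4 to 6 group scored IPC pairs into strengthening, emerging, and declining links, but this page does not give the grouping rule. The sketch below is one plausible reading, stated as an assumption: a pair already linked in the current network with a high score is strengthening, an unlinked pair with a high score is emerging, and a linked pair with a low score is declining. The cut-offs `high` and `low` and the list of current edges are hypothetical; only the four scores are taken from the tables.

```python
def classify_links(scored_pairs, existing_edges, high=0.5, low=0.3):
    """Split scored IPC pairs into three groups.

    high and low are hypothetical cut-offs, not values from the paper.
    """
    existing = {frozenset(edge) for edge in existing_edges}
    groups = {"strengthening": [], "emerging": [], "declining": []}
    for pair, score in scored_pairs.items():
        linked = frozenset(pair) in existing
        if score >= high:
            groups["strengthening" if linked else "emerging"].append((pair, score))
        elif linked and score <= low:
            groups["declining"].append((pair, score))
    for name, items in groups.items():
        # Strongest first, except declining links, where the weakest come first.
        items.sort(key=lambda item: item[1], reverse=(name != "declining"))
    return groups

# Scores taken from Tables 4, 5 and 6.
scores = {
    ("G06F21/62", "G06F21/60"): 0.8171,
    ("G06F21/62", "G11B11/00"): 0.8657,
    ("G06F21/62", "G08B25/08"): 0.1457,
    ("G06F21/62", "G08B25/00"): 0.1796,
}
# Assumed for illustration only: which pairs already co-occur in the network.
current_edges = [
    ("G06F21/60", "G06F21/62"),
    ("G06F21/62", "G08B25/08"),
    ("G06F21/62", "G08B25/00"),
]

groups = classify_links(scores, current_edges)
```

Using `frozenset` for edge membership makes the check independent of the order in which the two IPC codes are written.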