Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (9): 39-50     https://doi.org/10.11925/infotech.2096-3467.2022.0921
Research Paper
Predicting Dynamic Relationship for Financial Knowledge Graph*
Zhang Zhijian, Ni Zhenni, Liu Zhenghao, Xia Sudi
Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China
School of Information Management, Wuhan University, Wuhan 430072, China
Big Data Institute, Wuhan University, Wuhan 430072, China
Abstract

[Objective] This paper proposes a data-driven method for predicting dynamic relationships, offering a new perspective on rapidly updating financial knowledge graphs. [Methods] First, we crawled relevant information from the Internet according to a monitoring list and a retrieval strategy. Second, we used the masked language modeling task to construct a dataset and train the model. Third, we extracted the hierarchical structure of the financial knowledge graph to build the hidden layers of a neural network, in which each neuron represents a named entity. Fourth, we connected the hidden layers with relationship matrices and predicted dynamic relationships by updating these connection matrices. [Results] Taking the two equity changes in the early stage of the "Baowan dispute" (the contest between Baoneng Group and Vanke) as an example, the method quickly captured the changes in the relationships between the corresponding entities of the financial knowledge graph in different periods, which supports its effectiveness. [Limitations] Owing to the characteristics of self-supervised learning, the predicted relationships are relatively divergent and still require manual calibration and verification. [Conclusions] With sufficient data, the proposed method captures changes in the relationships between entities without manual annotation, and it can predict the relationships of a financial knowledge graph efficiently and continuously.
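
The Methods steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the placeholder entities and sentences, and in particular the co-occurrence update rule (a stand-in for the model training that the abstract does not detail) are all assumptions made for illustration.

```python
MASK = "[MASK]"


def build_masked_samples(sentences, monitored_entities):
    """Turn crawled sentences into (masked_sentence, target_entity) pairs."""
    samples = []
    for sentence in sentences:
        for entity in monitored_entities:
            if entity in sentence:
                # Mask the first mention only, so each sample has one target.
                samples.append((sentence.replace(entity, MASK, 1), entity))
    return samples


def init_relation_matrix(upper_layer, lower_layer, known_edges):
    """Rows are entities of one hierarchy level, columns those of the next."""
    return [[1.0 if (u, v) in known_edges else 0.0 for v in lower_layer]
            for u in upper_layer]


def update_relation_matrix(matrix, upper_layer, lower_layer, samples, step=0.1):
    """Hypothetical stand-in for training: strengthen a cell whenever a masked
    lower-layer entity co-occurs with an upper-layer entity in its context."""
    for masked_sentence, target in samples:
        if target not in lower_layer:
            continue
        col = lower_layer.index(target)
        for row, upper_entity in enumerate(upper_layer):
            if upper_entity in masked_sentence:
                matrix[row][col] += step
    return matrix


# Placeholder entities and sentences, invented for illustration only.
investors = ["InvestorA", "InvestorB"]
companies = ["CompanyX"]
news = [
    "InvestorA raised its stake in CompanyX.",
    "InvestorA disclosed a further purchase of CompanyX shares.",
]

samples = build_masked_samples(news, investors + companies)
weights = init_relation_matrix(investors, companies, {("InvestorB", "CompanyX")})
weights = update_relation_matrix(weights, investors, companies, samples)
```

In this toy run the cell linking InvestorA to CompanyX grows from zero as new text arrives, while the pre-existing InvestorB link is left unchanged; reading relationship changes off the connection matrix in this way is the idea the abstract describes, although the real model learns the matrix rather than counting co-occurrences.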

Key words: Knowledge Graph; Relationship Prediction; Self-Supervised Learning
Received: 2022-08-31      Published: 2023-10-24
CLC number (ZTFLH): G256
Funding: *Key Support Project of the Major Research Plan of the National Natural Science Foundation of China (91646206); Major Project of "Science and Technology Innovation 2030 - 'New Generation Artificial Intelligence'" (2020AAA0108505)
Corresponding author: Zhang Zhijian, ORCID: 0000-0002-7758-9277, E-mail: zzjian@whu.edu.cn
Cite this article:
Zhang Zhijian, Ni Zhenni, Liu Zhenghao, Xia Sudi. Predicting Dynamic Relationship for Financial Knowledge Graph[J]. Data Analysis and Knowledge Discovery, 2023, 7(9): 39-50.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022.0921      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2023/V7/I9/39
Fig.1  Dynamic relationship prediction model for the financial knowledge graph
Fig.2  Financial ontology library
Fig.3  Fully connected structure
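Model   Parameter      Description                                    Value
TransE  learning_rate  Learning rate                                  0.01
TransE  embedding_dim  Dimension of the generated knowledge vectors   128
TransE  margin         Margin between positive and negative samples   1
TransE  batch_size     Number of samples per training batch           32
TransE  max_epochs     Maximum number of training epochs              200
KGANN   learning_rate  Learning rate                                  0.001
KGANN   batch_size     Number of samples per training batch           16
KGANN   optimizer      Optimizer                                      Adam
KGANN   dropout_ratio  Proportion of neurons randomly deactivated     0.4
KGANN   max_length     Maximum input sentence length                  120
KGANN   max_epochs     Maximum number of training epochs              200
Table 1  Model parameter settings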
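
To show where two of the TransE settings in Table 1 enter the computation, the following sketch evaluates the standard TransE margin ranking loss for one positive and one corrupted triple. Only `embedding_dim = 128` and `margin = 1` come from the table; the L2 distance, the random initialisation, and the toy vectors are assumptions, and the training code used in the paper is not shown on this page.

```python
import math
import random

# Values taken from the TransE rows of Table 1.
EMBEDDING_DIM = 128
MARGIN = 1.0


def random_vector(dim, rng):
    """Small random embedding; the initialisation scheme is an assumption."""
    return [rng.uniform(-0.1, 0.1) for _ in range(dim)]


def distance(head, relation, tail):
    """L2 norm of head + relation - tail (lower means more plausible)."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def margin_loss(pos_triple, neg_triple, margin=MARGIN):
    """Hinge loss: the true triple should be closer than the corrupted one
    by at least `margin`, otherwise a positive loss is returned."""
    return max(0.0, margin + distance(*pos_triple) - distance(*neg_triple))


rng = random.Random(0)
head = random_vector(EMBEDDING_DIM, rng)
relation = random_vector(EMBEDDING_DIM, rng)
true_tail = [h + r for h, r in zip(head, relation)]  # exact fit: distance 0
wrong_tail = [t + 1.0 for t in true_tail]            # shifted away from head + relation

# Well-separated pair: the margin is already satisfied, so the loss is zero.
loss_good = margin_loss((head, relation, true_tail), (head, relation, wrong_tail))
# Swapped roles: the "positive" is the far triple, so the loss is large.
loss_bad = margin_loss((head, relation, wrong_tail), (head, relation, true_tail))
```

The margin therefore acts as the minimum gap that training tries to enforce between true and corrupted triples; once the gap is met, a pair contributes no gradient.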
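Rank  Shareholder                                                                                                                          Shares held    Shareholding
1     China Resources Co., Limited (华润股份有限公司)                                                                                      1,645,494,720  14.91%
2     HKSCC NOMINEES LIMITED                                                                                                               1,314,939,877  11.91%
3     Guosen Securities - ICBC - Guosen Jinpeng Graded No. 1 Collective Asset Management Plan (国信证券-工商银行-国信金鹏分级1号集合资产管理计划)  364,036,073    3.30%
4     Anbang Life Insurance Co., Ltd. - Conservative Investment Portfolio (安邦人寿保险股份有限公司-稳健型投资组合)                          234,552,728    2.13%
5     GIC PRIVATE LIMITED                                                                                                                  145,335,765    1.32%
Table 2  Shareholders of Vanke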
Fig.4  Proportion of cosine similarity between updated and existing relationships
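
The figure itself is not reproduced on this page. As one plausible reading of its caption, the sketch below computes the cosine similarity between an updated relationship vector and each existing relationship vector, then expresses each similarity as a share of the total. The relation names, the two-dimensional vectors, and the normalisation into shares are assumptions for illustration, not the procedure confirmed by the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_shares(updated, existing):
    """Share of each existing relation in the total non-negative similarity."""
    sims = {name: max(0.0, cosine_similarity(updated, vec))
            for name, vec in existing.items()}
    total = sum(sims.values())
    if total == 0.0:
        return {name: 0.0 for name in sims}
    return {name: s / total for name, s in sims.items()}


# Hypothetical relation vectors, chosen so the arithmetic is easy to follow.
existing_relations = {
    "holds_shares": [1.0, 0.0],
    "competes_with": [0.0, 1.0],
}
updated_relation = [3.0, 4.0]

shares = similarity_shares(updated_relation, existing_relations)
```

Here the similarities are 0.6 and 0.8, so the shares are 3/7 and 4/7; the existing relation with the largest share is the closest match to the updated one.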