Constructing Knowledge Graph for Financial Securities and Discovering Related Stocks with Knowledge Association
Liu Zhenghao1,2,3, Qian Yuxing1,2,3, Yi Tianlong1,2, Lv Huakui1,2
1School of Information Management, Wuhan University, Wuhan 430072, China
2Institute of Big Data, Wuhan University, Wuhan 430072, China
3Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China
[Objective] This paper constructs a domain knowledge graph based on knowledge association and uses it to discover industry characteristics and related stocks, aiming to improve investors’ decision making. [Methods] First, we constructed a “seed” knowledge graph from stock data. Second, we performed entity extraction and relation classification on unstructured text with the pre-trained FinBERT model to generate triples. Third, we merged the seed graph with the extracted triples to create the knowledge graph for financial securities. Fourth, we applied link prediction, similarity calculation and other data mining algorithms to the graph to discover related stocks and their hidden characteristics. The findings were preliminarily verified with statistical methods. [Results] The resulting knowledge graph contains 111,845 entities and 163,370 relationships. We analyzed the 10 cross-industry stocks most similar to “Northeast Securities” and used “Sihuan Biology” to examine potential nonlinear correlations between stocks. [Limitations] The knowledge graph only captures the impact of static information (e.g., industry and shareholder ownership) on stock correlation. [Conclusions] The knowledge graph provides data analytics support for investors in designing portfolio strategies and predicting stock trends.
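The merge and similarity steps in the abstract can be illustrated with a minimal sketch. The abstract does not state which link-prediction score is used, so the Adamic-Adar index is an assumption here, chosen because it is a common neighbourhood-based score. All entity names, relation names and triples below are hypothetical toy data, not data from the paper, and the helper functions are illustrative rather than the authors' implementation.

```python
import math
from collections import defaultdict

# Hypothetical toy triples (head, relation, tail); not data from the paper.
seed_triples = [
    ("StockA", "belongs_to", "Securities"),
    ("StockB", "belongs_to", "Securities"),
    ("StockD", "belongs_to", "Securities"),
    ("StockC", "belongs_to", "Pharma"),
    ("StockA", "held_by", "FundX"),
    ("StockC", "held_by", "FundX"),
]
# Stand-in for triples extracted from text (assumed output of the extraction step).
extracted_triples = [
    ("StockB", "held_by", "FundY"),
    ("StockC", "held_by", "FundY"),
    ("StockA", "held_by", "FundX"),  # duplicate of a seed triple
]


def merge_triples(*sources):
    """Union several triple lists, dropping exact duplicates, keeping first-seen order."""
    seen, merged = set(), []
    for source in sources:
        for triple in source:
            if triple not in seen:
                seen.add(triple)
                merged.append(triple)
    return merged


def build_adjacency(triples):
    """Treat every triple as an undirected edge between its head and tail."""
    adj = defaultdict(set)
    for head, _relation, tail in triples:
        adj[head].add(tail)
        adj[tail].add(head)
    return adj


def adamic_adar(adj, u, v):
    """Sum 1 / log(degree) over the common neighbours of u and v."""
    score = 0.0
    for w in adj[u] & adj[v]:
        degree = len(adj[w])
        if degree > 1:  # guard: log(1) == 0 would cause a division by zero
            score += 1.0 / math.log(degree)
    return score


graph = merge_triples(seed_triples, extracted_triples)
adj = build_adjacency(graph)

# Rank the other stocks by their Adamic-Adar score against StockA.
stocks = ["StockA", "StockB", "StockC", "StockD"]
ranked = sorted(
    ((adamic_adar(adj, "StockA", s), s) for s in stocks if s != "StockA"),
    reverse=True,
)
for score, stock in ranked:
    print(f"{stock}: {score:.3f}")
```

In this toy graph the cross-industry StockC ranks above StockB and StockD for StockA: the shared fund has only two neighbours, so it carries more weight than the shared industry node with three members. This is the kind of non-obvious association that a neighbourhood-based score can surface, under the assumptions stated above.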
Liu Zhenghao, Qian Yuxing, Yi Tianlong, Lv Huakui. Constructing Knowledge Graph for Financial Securities and Discovering Related Stocks with Knowledge Association[J]. Data Analysis and Knowledge Discovery, 2022, 6(2/3): 184-201.