Constructing Knowledge Graph for Financial Securities and Discovering Related Stocks with Knowledge Association
Liu Zhenghao1,2,3, Qian Yuxing1,2,3, Yi Tianlong1,2, Lv Huakui1,2
1 School of Information Management, Wuhan University, Wuhan 430072, China
2 Institute of Big Data, Wuhan University, Wuhan 430072, China
3 Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China

Abstract [Objective] This paper constructs a domain knowledge graph based on knowledge association and uses it to discover industry characteristics and related stocks, aiming to improve investors’ decision making. [Methods] First, we constructed a “seed” knowledge graph from stock data. Second, we performed entity extraction and relation classification on unstructured text with the pre-trained FinBERT model to generate triples. Third, we merged the seed graph with the triples to create the knowledge graph for financial securities. Fourth, we applied link prediction, similarity calculation, and other data mining algorithms to the graph to discover related stocks and their hidden characteristics. The findings were preliminarily verified with statistical methods. [Results] The resulting knowledge graph contains 111,845 entities and 163,370 relationships. We analyzed the 10 cross-industry stocks most similar to “Northeast Securities” and used “Sihuan Biology” to examine potential nonlinear correlation between stocks. [Limitations] The knowledge graph only captures the impact of static information (e.g., industry and shareholder ownership) on stock correlation. [Conclusions] The knowledge graph provides strong data analytics support for investors in designing effective portfolio strategies and predicting stock trends.
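
The merge and similarity steps in [Methods] can be illustrated with a minimal Python sketch. It is not the authors' implementation: the triples, entity names, relation labels, and helper functions (`build_graph`, `jaccard`, `adamic_adar`, `rank_related`) are all hypothetical. The abstract does not say which similarity or link prediction scores were used, so the Jaccard coefficient and the Adamic-Adar index are assumed here as two common choices.

```python
import math
from collections import defaultdict

# Hypothetical triples (head, relation, tail); NOT data from the paper.
seed_triples = [
    ("StockA", "belongs_to", "Securities"),
    ("StockB", "belongs_to", "Securities"),
    ("StockC", "belongs_to", "Pharma"),
    ("HolderX", "holds", "StockA"),
    ("HolderX", "holds", "StockC"),
]
# Stand-in for triples produced by entity and relation extraction from text.
extracted_triples = [
    ("HolderY", "holds", "StockA"),
    ("HolderY", "holds", "StockB"),
    ("HolderY", "holds", "StockC"),
]


def build_graph(*triple_sets):
    """Merge triple sets into an undirected adjacency map; duplicates collapse."""
    adj = defaultdict(set)
    for triples in triple_sets:
        for head, _relation, tail in triples:
            adj[head].add(tail)
            adj[tail].add(head)
    return adj


def jaccard(adj, u, v):
    """Shared neighbours divided by all neighbours of u and v."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    """Sum of 1 / log(degree) over shared neighbours; rare neighbours weigh more."""
    return sum(
        1.0 / math.log(len(adj[w]))
        for w in adj[u] & adj[v]
        if len(adj[w]) > 1  # log(1) == 0 would divide by zero
    )


def rank_related(adj, target, candidates, score=jaccard):
    """Rank candidate stocks by their score against the target, highest first."""
    scored = [(c, score(adj, target, c)) for c in candidates if c != target]
    return sorted(scored, key=lambda item: item[1], reverse=True)


graph = build_graph(seed_triples, extracted_triples)
stocks = ["StockA", "StockB", "StockC"]
ranking = rank_related(graph, "StockA", stocks)
```

In this toy graph `StockA` and `StockC` belong to different industries but share two holders, so they still receive a nonzero similarity. This is the kind of cross-industry association the abstract describes, shown here only on made-up data.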

Received: 21 June 2021
Published: 14 April 2022
Fund: National Natural Science Foundation of China (91646206); National Key Research and Development Program of China (2020AAA0108505)
Corresponding Author: Liu Zhenghao, ORCID: 0000-0003-1356-7017, E-mail: zhenghaoliu@whu.edu.cn