Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (7): 126-138    DOI: 10.11925/infotech.2096-3467.2020.0907
Predicting Stock Trends with CNN-BiLSTM Based Multi-Feature Integration Model
Xu Yuemei1, Wang Zihou2, Wu Zixin1
1School of Information Science and Technology, Beijing Foreign Studies University, Beijing 100089, China
2National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China
Abstract  

[Objective] Building on traditional financial data analysis, this paper explores the impact of online news on the stock market, aiming to improve the accuracy of stock trend prediction. [Methods] First, we used a Convolutional Neural Network (CNN) and a Bi-directional Long Short-Term Memory network (Bi-LSTM) to extract news events and their sentiment orientations. Then, we proposed a stock trend prediction model that combines numerical stock data with news event sentiments. Finally, we examined the feasibility of this model on two individual stocks (GREE Electric Appliance in the household appliance industry and ZTE in the electronic appliance industry). [Results] The prediction accuracy of our model was 11.6% and 25.6% higher than that of the existing algorithms. [Limitations] We did not evaluate the impact of the prediction period on the performance of the proposed model. [Conclusions] News events and their sentiment orientations can contribute to fluctuations in stock prices.

Key words: Deep Learning; Feature Combination; Sentiment Analysis; Stock Trends
Received: 15 September 2020      Published: 15 April 2021
ZTFLH:  TP393  
Fund: Project of Double Top-Class Foundation of Beijing Foreign Studies University (YY19ZZA012)
Corresponding Author: Xu Yuemei, ORCID: 0000-0002-0223-7146, E-mail: xuyuemei@bfsu.edu.cn

Cite this article:

Xu Yuemei, Wang Zihou, Wu Zixin. Predicting Stock Trends with CNN-BiLSTM Based Multi-Feature Integration Model. Data Analysis and Knowledge Discovery, 2021, 5(7): 126-138.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.0907     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I7/126

The Flowchart of Stock Trends Prediction Model Based on Combination of News Event and News Sentiment Orientation
$D_1$ $D_2$ $D_j$ $D_p$
$T_1$ $\bar{d}_{11}$ $\bar{d}_{12}$ $\bar{d}_{1j}$ $\bar{d}_{1p}$
$T_i$ $\bar{d}_{i1}$ $\bar{d}_{i2}$ $\bar{d}_{ij}$ $\bar{d}_{ip}$
$T_n$ $\bar{d}_{n1}$ $\bar{d}_{n2}$ $\bar{d}_{nj}$ $\bar{d}_{np}$
Feature Matrix of Stock Finance
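The bar over each entry suggests that the financial indicators are normalized before they enter the model, although this page does not say which normalization is used. The sketch below assumes a column-wise z-score and uses made-up indicator values; the function name `standardize_columns` is illustrative and is not taken from the paper.

```python
from statistics import mean, pstdev

def standardize_columns(rows):
    """Z-score each column of a row-major matrix (rows = periods, columns = indicators)."""
    columns = list(zip(*rows))
    stats = []
    for col in columns:
        mu = mean(col)
        sigma = pstdev(col)
        # Guard against constant columns, which would otherwise divide by zero.
        stats.append((mu, sigma if sigma > 0 else 1.0))
    return [
        [(value - mu) / sigma for value, (mu, sigma) in zip(row, stats)]
        for row in rows
    ]

# Hypothetical raw values: 3 periods (T_1..T_3) by 2 indicators (D_1, D_2).
raw = [
    [10.0, 200.0],
    [12.0, 220.0],
    [14.0, 240.0],
]
feature_matrix = standardize_columns(raw)
```

Each column of `feature_matrix` then has zero mean and unit variance, so indicators measured on different scales become comparable.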
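Event category | Event names
Trading | Trading suspension; Trading resumption; Capital inflow; Capital outflow; Block trade; Share price inversion; New high
Equity | Listing; Backdoor listing; Stake disclosure (举牌); Mergers and acquisitions; Asset restructuring; Asset freeze; Equity transfer
Investment and financing | Investment; Investment in construction; Winning a bid; Bond issuance; Stock issuance; Convertible bonds; Fundraising; Share pledge; Dividend
Corporate affairs | Change in registered capital; Rapid growth; Strategic cooperation; Business expansion; Executive share reduction or resignation
External events | Appearing on the Dragon-Tiger List (龙虎榜); Exchange penalty; Rating upgrade; Rating downgrade; Favorable policy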
Part of Categories of News Events
$S_1$ $S_2$ $S_j$ $S_q$
$T_1$ $s_{11}$ $s_{12}$ $s_{1j}$ $s_{1q}$
$T_i$ $s_{i1}$ $s_{i2}$ $s_{ij}$ $s_{iq}$
$T_n$ $s_{n1}$ $s_{n2}$ $s_{nj}$ $s_{nq}$
Feature Matrix of News Event
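One way to read this matrix is as one row per period and one column per news event type. The sketch below is an assumption rather than the authors' definition: it fills each entry with the summed sentiment polarity (+1 or -1) of the news items of that event type in that period. The event names, dates, and polarities are all hypothetical.

```python
# Hypothetical subset of event types (columns S_1..S_q).
EVENT_TYPES = ["trading_suspension", "winning_bid", "share_pledge"]

def build_event_matrix(days, news_items, event_types=EVENT_TYPES):
    """Return an n x q matrix; entry [i][j] sums the polarity of event j on day i."""
    column_of = {name: j for j, name in enumerate(event_types)}
    row_of = {day: i for i, day in enumerate(days)}
    matrix = [[0] * len(event_types) for _ in days]
    for day, event, polarity in news_items:
        # Ignore news outside the sampled days or with an unknown event type.
        if day in row_of and event in column_of:
            matrix[row_of[day]][column_of[event]] += polarity
    return matrix

days = ["2020-01-02", "2020-01-03"]
news = [
    ("2020-01-02", "winning_bid", +1),
    ("2020-01-03", "share_pledge", -1),
    ("2020-01-03", "winning_bid", +1),
]
event_matrix = build_event_matrix(days, news)
```

A matrix built this way has the same row index as the financial feature matrix, so the two can be concatenated row by row before being passed to a sequence model.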
Analysis Model of News Sentiment Orientation Based on Bi-LSTM
Sampling Period of Stock Trends Prediction Model
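Parameter | CNN | Bi-LSTM
Word embedding dimension | 300 | 300
Number of convolution kernels | 96 | Null
Convolution kernel sizes | 3, 4, 5 | Null
Dropout | 0.5 | 0.5
Batch_size | 128 | 128
Number of iterations | 10 | 20
Title truncation length | Null | 15
Neurons per LSTM layer | Null | [256, 256]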
Parameter Settings of the Prediction Model
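The settings above can be captured in a small configuration object. The sketch below is illustrative only: the class and field names are invented here, and the `pad_or_truncate` helper assumes that the title truncation length of 15 counts tokens and that shorter titles are padded, neither of which is stated on this page.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class CnnConfig:
    embedding_dim: int = 300
    num_kernels: int = 96
    kernel_sizes: Tuple[int, ...] = (3, 4, 5)
    dropout: float = 0.5
    batch_size: int = 128
    iterations: int = 10

@dataclass(frozen=True)
class BiLstmConfig:
    embedding_dim: int = 300
    dropout: float = 0.5
    batch_size: int = 128
    iterations: int = 20
    title_length: int = 15
    lstm_units: Tuple[int, ...] = (256, 256)

def pad_or_truncate(tokens, length, pad_token="<pad>"):
    """Cut a token list to `length`, or right-pad it with `pad_token` (assumed behavior)."""
    kept = list(tokens[:length])
    return kept + [pad_token] * (length - len(kept))

config = BiLstmConfig()
title_tokens = ["ZTE", "wins", "bid"]  # hypothetical tokenized news title
model_input = pad_or_truncate(title_tokens, config.title_length)
```

Keeping the two configurations separate mirrors the table, where the convolution settings apply only to the CNN and the title length and LSTM sizes apply only to the Bi-LSTM.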
Dataset | SVM | Maxent | CNN
Training set | 90.8% | 72.0% | 93.0%
Test set | 85.2% | 69.4% | 87.7%
Experiment Results on News Event Classification
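Well-classified news event types (examples) | Precision | Recall | F1 | Poorly classified news event types (examples) | Precision | Recall | F1
Appearing on the Dragon-Tiger List (龙虎榜) | 1.00 | 1.00 | 1.00 | Performance decline | 0.64 | 0.58 | 0.61
Trading suspension | 0.98 | 1.00 | 0.99 | Favorable policy | 0.81 | 0.65 | 0.72
Business registration change | 1.00 | 1.00 | 1.00 | Capital change | 1.00 | 0.22 | 0.36
Winning a bid | 1.00 | 1.00 | 1.00 | Executive appointment | 0.50 | 0.40 | 0.44
Convertible bonds | 0.97 | 0.97 | 0.97 | Performance growth | 0.68 | 0.73 | 0.71
Share pledge | 1.00 | 1.00 | 1.00 | Expected decline | 0.67 | 0.61 | 0.64
Exchange inquiry | 0.94 | 1.00 | 0.97 | Unfavorable news (利差消息) | 0.42 | 0.47 | 0.44
Delisting | 1.00 | 1.00 | 1.00 | Favorable news | 0.46 | 0.65 | 0.54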
Performance Statistics of News Event Classification
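The F1 column is the harmonic mean of precision and recall, which explains why an event type with perfect precision but low recall still scores poorly. The short check below recomputes F1 for three rows of the table; the dictionary keys are English labels chosen here for readability.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs copied from the table above.
rows = {
    "trading_suspension": (0.98, 1.00),
    "favorable_policy": (0.81, 0.65),
    "capital_change": (1.00, 0.22),
}
f1_by_event = {name: round(f1_score(p, r), 2) for name, (p, r) in rows.items()}
```

The recomputed values agree with the table: 0.99 for trading suspension, 0.72 for favorable policy, and 0.36 for capital change, where the recall of 0.22 dominates the score.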
Dataset | SVM precision | Maxent precision | Bi-LSTM precision
Training set | 86.6% | 82.8% | 99.0%
Test set | 81.1% | 76.1% | 91.0%
Experiment Results on News Sentiment Classification
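Stock | LSTM with financial features | LSTM with news events | GBDT with fused news events and sentiment | LSTM with fused news events and sentiment
GREE Electric Appliance | 0.6998 | 0.7545 | 0.6257 | 0.7812
ZTE | 0.6467 | 0.7851 | 0.6545 | 0.8127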
Experiment Results on Stock Trends Prediction
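As a rough reading aid, the relative gain of the fused LSTM over the LSTM that uses financial features alone can be computed from the table. The page does not state which baseline the abstract's percentages refer to, so treating the financial-feature LSTM as that baseline is an assumption made here.

```python
def relative_gain(new, baseline):
    """Fractional improvement of `new` over `baseline`."""
    return (new - baseline) / baseline

# Scores copied from the table above.
results = {
    "GREE": {"lstm_financial": 0.6998, "lstm_fused": 0.7812},
    "ZTE": {"lstm_financial": 0.6467, "lstm_fused": 0.8127},
}
gains = {
    stock: relative_gain(scores["lstm_fused"], scores["lstm_financial"])
    for stock, scores in results.items()
}
for stock, gain in gains.items():
    print(f"{stock}: {gain:.1%}")
```

Under that assumption the gains come to roughly 11.6% for GREE and 25.7% for ZTE, which is close to the 11.6% and 25.6% quoted in the abstract and suggests those figures are relative rather than absolute improvements.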
Individual Stock Trends Prediction on GREE Electric Appliance (000651.SZ)
Individual Stock Trends Prediction on ZTE Communication (000063.SZ)
Importance Score of Different Features Based on GBDT
Impact of Threshold Value on Stock Trend Prediction