Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (12): 33-42    DOI: 10.11925/infotech.2096-3467.2018.0420
Predicting Stock Prices with Text and Price Combined Model
Yu Chuanming¹, Gong Yutian¹, Wang Feng¹, An Lu²
¹ School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan 430073, China
² School of Information Management, Wuhan University, Wuhan 430072, China
Abstract  

[Objective] This paper uses big data to predict stock price fluctuations, aiming to improve forecasting accuracy and reduce trading risk. [Methods] We propose a Text and Price Combined Model (TPCM). First, we process comments retrieved from a stock forum. Then we use a deep representation learning algorithm to generate a text feature matrix and K-means clustering to generate text categories. Finally, we use a Multi-Layer Perceptron (MLP) to predict stock price fluctuation based on the opening price, closing price, and 15 other original price indicators. [Results] TPCM achieved an accuracy of 65.91%, which is 7.76 percentage points higher than the model using price features only (58.15%) and 11.37 percentage points higher than the model using text features only (54.54%). [Limitations] The proposed model was evaluated on only one stock. [Conclusions] Combining text and price features can improve stock price forecasting and offers a new perspective for future studies.
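The feature-combination step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy data, the function names `kmeans` and `combine`, the choice of two clusters, and the one-hot encoding of the text category are all assumptions, since the abstract does not say how the text category is joined to the price indicators. The deep representation learning step is replaced here by hand-written 2-d vectors.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def combine(price_rows, text_labels, k):
    """Append a one-hot text category to each day's price features (assumed encoding)."""
    combined = []
    for row, lab in zip(price_rows, text_labels):
        one_hot = [1.0 if c == lab else 0.0 for c in range(k)]
        combined.append(list(row) + one_hot)
    return combined


# Hypothetical toy data: 6 trading days, 2-d text vectors, 3 price indicators.
text_vectors = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.1],
                [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
price_rows = [[10.0, 10.5, 1.2], [10.5, 10.4, 0.9], [10.4, 10.9, 1.5],
              [10.9, 10.7, 1.1], [10.7, 11.2, 1.8], [11.2, 11.0, 1.0]]

K = 2  # number of text categories (assumed)
labels = kmeans(text_vectors, K)
features = combine(price_rows, labels, K)
```

Each row of `features` now holds the day's price indicators followed by its text category. In a pipeline like the one the abstract outlines, rows of this kind would be passed to a classifier such as an MLP to predict whether the price rises or falls.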

Keywords: Text; Stock Price; Stock Price Fluctuation Prediction; Text and Price Combined Model
Received: 16 April 2018      Published: 16 January 2019
ZTFLH:  TP391.1  

Cite this article:

Yu Chuanming, Gong Yutian, Wang Feng, An Lu. Predicting Stock Prices with Text and Price Combined Model. Data Analysis and Knowledge Discovery, 2018, 2(12): 33-42.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2018.0420     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I12/33

Time lag    LSTM      Bi-LSTM
1 day       46.67%    52.22%
2 days      50.56%    44.94%
3 days      53.41%    54.54%
4 days      45.98%    49.43%
5 days      45.34%    43.02%
6 days      45.88%    45.88%
7 days      45.23%    39.29%
Features      Algorithm      P         R         F         ACC       AUC
Price         AdaBoosting    45.12%    55.69%    50.89%    56.18%    56.36%
Price         DT             59.86%    58.12%    58.49%    57.90%    53.99%
Price         KNN            56.32%    56.42%    56.36%    54.68%    50.00%
Price         NB             40.58%    47.76%    44.63%    47.73%    52.00%
Price         SVM            54.68%    55.83%    55.46%    54.62%    53.31%
Price         MLP            57.67%    58.22%    58.09%    58.15%    50.00%
Price+Text    AdaBoosting    50.65%    51.53%    49.58%    51.21%    57.95%
Price+Text    DT             41.94%    42.05%    41.86%    42.05%    42.06%
Price+Text    KNN            51.17%    51.14%    50.83%    51.14%    51.12%
Price+Text    NB             59.17%    59.09%    59.09%    59.09%    56.03%
Price+Text    SVM            57.37%    56.82%    56.00%    56.82%    55.68%
Price+Text    TPCM(MLP)      66.78%    65.91%    65.46%    65.91%    62.66%
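For readers unfamiliar with the column abbreviations, the sketch below shows the standard definitions of precision (P), recall (R), F-measure (F) and accuracy (ACC) for a binary rise/fall prediction. It is an illustration on hypothetical labels, not the evaluation code behind the table. The page does not state whether the reported figures are averaged over the two classes, so the reported F values need not equal the single-class harmonic mean computed here. AUC is omitted because it requires predicted scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Precision, recall, F-measure and accuracy for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure: harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs)
    return {"P": precision, "R": recall, "F": f_measure, "ACC": accuracy}


# Hypothetical labels: 1 = price rises, 0 = price falls.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```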