Constructing Sentiment Dictionary with Deep Learning: Case Study of Financial Data
Hu Jiaheng1, Cen Yonghua1, Wu Chengyao2
1School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094, China
2College of Finance, Nanjing Agricultural University, Nanjing 210095, China
[Objective] This paper proposes a new method for automatically constructing a domain-specific sentiment dictionary for sentiment analysis in finance. [Methods] The method builds the dictionary from the characteristics of both the corpus and an existing knowledge base. It first maps the text into a vector space using word vectors. An existing general-purpose sentiment dictionary is then used to label the training corpus automatically, and the labelled data are split into training and forecasting sets at a ratio of 9:1. Finally, a deep learning neural network classifier implemented in Python determines the sentiment polarity of the candidate words for the new dictionary. [Results] The classifier reached an accuracy of 95.02% on the training set and 95.00% on the forecasting set, which is better than existing models. [Limitations] The method for extracting seed words could be further optimized. [Conclusions] The proposed method enlarges the training corpus, allowing the neural network classifier to be trained more effectively. It also extracts sentiment information from the semantic relatedness of word vectors. The resulting sentiment dictionary offers possible directions for future research.
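The pipeline described under [Methods] can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 2-D word vectors are synthetic, the names `general_dictionary`, `candidate_a` and `candidate_b` are hypothetical, and a single sigmoid unit stands in for the deep neural network classifier, whose architecture the abstract does not specify.

```python
import math
import random

rng = random.Random(42)

# Synthetic 2-D "word vectors" (assumption): positive words cluster near
# (1, 1) and negative words near (-1, -1). Real vectors would come from a
# word-embedding model trained on the financial corpus.
def toy_vector(center):
    return (rng.gauss(center, 0.3), rng.gauss(center, 0.3))

word_vectors = {}
general_dictionary = {}  # hypothetical general sentiment dictionary: word -> polarity
for i in range(50):
    word_vectors[f"pos_{i}"] = toy_vector(1.0)
    general_dictionary[f"pos_{i}"] = 1
    word_vectors[f"neg_{i}"] = toy_vector(-1.0)
    general_dictionary[f"neg_{i}"] = 0

# Candidate domain words that the general dictionary does not cover.
word_vectors["candidate_a"] = (0.9, 1.1)
word_vectors["candidate_b"] = (-1.0, -0.8)

# Automatic labelling: every word found in the general dictionary becomes
# a labelled sample (vector, polarity).
labelled = [(vec, general_dictionary[w])
            for w, vec in word_vectors.items() if w in general_dictionary]

# 9:1 split into training and forecasting sets.
rng.shuffle(labelled)
cut = len(labelled) * 9 // 10
train_set, forecast_set = labelled[:cut], labelled[cut:]

def probability(model, vec):
    """Sigmoid of the weighted sum; the score is clamped to avoid overflow."""
    w, b = model
    z = sum(wi * xi for wi, xi in zip(w, vec)) + b
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, dim=2, epochs=50, lr=0.5):
    """Fit one sigmoid unit with stochastic gradient descent on log loss."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for vec, label in samples:
            err = probability((w, b), vec) - label
            w = [wi - lr * err * xi for wi, xi in zip(w, vec)]
            b -= lr * err
    return w, b

def predict(model, vec):
    return 1 if probability(model, vec) >= 0.5 else 0

def accuracy(model, samples):
    return sum(predict(model, v) == y for v, y in samples) / len(samples)

model = train(train_set)
train_acc = accuracy(model, train_set)
forecast_acc = accuracy(model, forecast_set)

# Assign a polarity to each candidate word for the new dictionary.
new_dictionary = {w: predict(model, word_vectors[w])
                  for w in ("candidate_a", "candidate_b")}
```

Here `new_dictionary` maps each candidate word to 1 (positive) or 0 (negative). In practice the sigmoid unit would be replaced by a multi-layer network, and the synthetic vectors by embeddings learned from the domain corpus.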
Hu Jiaheng, Cen Yonghua, Wu Chengyao. Constructing Sentiment Dictionary with Deep Learning: Case Study of Financial Data[J]. Data Analysis and Knowledge Discovery, 2018, 2(10): 95-102.