Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (10): 95-102    DOI: 10.11925/infotech.2096-3467.2018.0169
Constructing Sentiment Dictionary with Deep Learning: Case Study of Financial Data
Jiaheng Hu1, Yonghua Cen1, Chengyao Wu2
1School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094, China
2College of Finance, Nanjing Agricultural University, Nanjing 210095, China
Abstract  

[Objective] This paper proposes a new method for constructing a sentiment dictionary for sentiment analysis in the financial domain. [Methods] The method builds the dictionary from the characteristics of both the corpus and an existing knowledge base. Textual information is mapped into a vector space using word vectors. With the help of an existing general sentiment dictionary, the training corpus is labeled automatically and split into training and forecasting (test) sets at a ratio of 9:1. A deep-learning neural network classifier, implemented in Python, is then trained and used to determine the sentiment polarity of the candidate words for the new dictionary. [Results] The classifier achieved an accuracy of 95.02% on the training set and 95.00% on the forecasting set, which is better than the existing models. [Limitations] The method for extracting seed words could be further optimized. [Conclusions] The proposed method enlarges the corpus available for training, so the neural network classifier can be trained more effectively. It also extracts sentiment information from the semantic relatedness of word vectors. The new sentiment dictionary suggests possible directions for future research.
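
The pipeline in the abstract (word vectors, automatic labeling with a general sentiment dictionary, a 9:1 split, a classifier, polarity prediction for candidate words) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy two-dimensional vectors, the word names such as pos_0 and neg_0, the seed dictionary, and all function names are hypothetical, and a single-layer logistic classifier stands in for the paper's neural network, whose architecture is not described on this page.

```python
import math
import random


def make_toy_vectors(seed=0):
    """Hypothetical stand-in for trained word vectors (assumption, toy data)."""
    rng = random.Random(seed)
    vectors = {}
    for i in range(50):
        vectors[f"pos_{i}"] = [1 + rng.gauss(0, 0.2), 1 + rng.gauss(0, 0.2)]
        vectors[f"neg_{i}"] = [-1 + rng.gauss(0, 0.2), -1 + rng.gauss(0, 0.2)]
    return vectors


def make_seed_dictionary():
    """Hypothetical general sentiment dictionary: 1 = positive, 0 = negative."""
    seed_dict = {}
    for i in range(40):
        seed_dict[f"pos_{i}"] = 1
        seed_dict[f"neg_{i}"] = 0
    return seed_dict


def label_with_seed_dictionary(vectors, seed_dict):
    """Automatically label every word that the general dictionary covers."""
    return [(vectors[w], seed_dict[w]) for w in vectors if w in seed_dict]


def split_samples(samples, ratio=0.9, seed=0):
    """Shuffle and split into training and forecasting sets (9:1 by default)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * ratio)
    return shuffled[:cut], shuffled[cut:]


def score(model, x):
    """Sigmoid of the linear score; the input is clamped to avoid overflow."""
    w, b = model
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs=200, lr=0.1):
    """Stochastic gradient descent on the logistic loss (single-layer stand-in)."""
    w, b = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        for x, y in samples:
            err = score((w, b), x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(model, x):
    """Return 1 (positive) or 0 (negative) for a word vector."""
    return 1 if score(model, x) >= 0.5 else 0


def accuracy(model, samples):
    return sum(predict(model, x) == y for x, y in samples) / len(samples)


vectors = make_toy_vectors()
seed_dict = make_seed_dictionary()
train_set, test_set = split_samples(label_with_seed_dictionary(vectors, seed_dict))
model = train(train_set)

# Candidate words are those the general dictionary does not cover.
candidates = [w for w in vectors if w not in seed_dict]
new_dictionary = {w: predict(model, vectors[w]) for w in candidates}
print("train accuracy:", accuracy(model, train_set))
print("forecast accuracy:", accuracy(model, test_set))
```

In a real setting the toy vectors would be replaced by vectors trained on the financial corpus, and the seed dictionary by an existing general-purpose sentiment lexicon; the labeling, splitting, and prediction steps keep the same shape.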

Key words: Sentiment Dictionary; Deep Learning; Financial Field; Word Vector; Neural Network
Received: 09 February 2018      Published: 12 November 2018

Cite this article:

Jiaheng Hu, Yonghua Cen, Chengyao Wu. Constructing Sentiment Dictionary with Deep Learning: Case Study of Financial Data. Data Analysis and Knowledge Discovery, 2018, 2(10): 95-102.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2018.0169     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I10/95
