Building Product Feature Dictionary with Large-scale Review Data
Li Weiqing1,2, Wang Weijun2
1(School of Information Management, Central China Normal University, Wuhan 430079, China) 2(Key Laboratory of Adolescent Cyberpsychology and Behavior, Ministry of Education, Central China Normal University, Wuhan 430079, China)
[Objective] This paper proposes a method for building a product feature dictionary from large-scale review data, aiming to improve the dictionary's precision and recall. [Methods] First, we constructed a seed dictionary through manual labeling and synonym-forest expansion. Then, we trained word vectors on large-scale product reviews to calculate the semantic similarity and relevance of words. Finally, we identified and categorized the product features to construct the dictionary. [Results] We evaluated the proposed method on product reviews of mobile phones, cameras and books; it achieved an average precision of 0.774 and an average recall of 0.855. [Limitations] The proposed method requires substantial manual effort at the labeling and verification stages, and it does not consider implicit features in product reviews. [Conclusions] The proposed method can effectively build a product feature dictionary with better recall.
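The categorization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which similarity measure or threshold was used, so cosine similarity and the 0.8 cutoff are assumptions, and the three-dimensional vectors, the seed dictionary, and the function names (`cosine`, `assign_category`, `build_dictionary`) are hypothetical stand-ins for vectors trained on a real review corpus.

```python
import math

# Toy 3-dimensional word vectors (hypothetical values, for illustration only).
# In practice these would come from a model trained on the review corpus.
VECTORS = {
    "screen":   [0.9, 0.1, 0.0],
    "display":  [0.8, 0.2, 0.1],
    "battery":  [0.1, 0.9, 0.1],
    "charging": [0.2, 0.8, 0.0],
    "weather":  [0.0, 0.1, 0.9],
}

# Seed dictionary: feature category -> manually labeled seed words (hypothetical).
SEEDS = {"screen": ["screen"], "battery": ["battery"]}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign_category(word, seeds=SEEDS, vectors=VECTORS, threshold=0.8):
    """Return the category of the most similar seed word, or None if no
    seed reaches the (assumed) similarity threshold."""
    best_category, best_similarity = None, threshold
    for category, seed_words in seeds.items():
        for seed in seed_words:
            similarity = cosine(vectors[word], vectors[seed])
            if similarity >= best_similarity:
                best_category, best_similarity = category, similarity
    return best_category


def build_dictionary(candidates, seeds=SEEDS, vectors=VECTORS, threshold=0.8):
    """Extend a copy of the seed dictionary with candidates that match a category."""
    dictionary = {category: list(words) for category, words in seeds.items()}
    for word in candidates:
        category = assign_category(word, seeds, vectors, threshold)
        if category is not None and word not in dictionary[category]:
            dictionary[category].append(word)
    return dictionary


feature_dictionary = build_dictionary(["display", "charging", "weather"])
```

With these toy vectors, "display" and "charging" join the "screen" and "battery" categories, while "weather" matches no seed closely enough and is left out. Raising the threshold would tend to favor precision and lowering it would tend to favor recall, which is the trade-off the reported metrics measure.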