Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (12): 1-9    DOI: 10.11925/infotech.2096-3467.2017.0618
Original Article
Examining Product Reviews with Sentiment Analysis and Opinion Mining
Bo Guo1, Shouguang Li1, Hao Wang1, Xiaojun Zhang1, Wei Gong1, Zhaojun Yu1, Yu Sun2
1Meizu Telecom Equipment Co., Ltd., Beijing 100872, China
2Computer Science Department, California State Polytechnic University, Pomona 91768, USA

[Objective] This study analyzes the large volume of reviews generated by e-commerce website users in order to assess marketing strategies. [Methods] We used syntactic parsing, a bag-of-words model, and machine learning techniques to examine real-world datasets from JD and TMall. The proposed method analyzes sentiment and extracts opinions from the reviews automatically. [Results] The accuracy of the sentiment analysis was 90%. We constructed an automatic vocabulary-building mechanism that does not depend on a dictionary. The F-measure of the new system was 71%. [Limitations] The recall of the opinion extraction needs to be improved. [Conclusions] The proposed system could effectively monitor word-of-mouth issues facing products sold online, and it could be transferred to many online businesses.

Keywords: User Review; Sentiment Analysis; Opinion Mining; Machine Learning; Tag Extraction
Received: 29 June 2017      Published: 29 December 2017
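
The abstract names a bag-of-words model, machine learning, and an F-measure, but gives no implementation details. The sketch below is only an illustration of those general ideas, not the authors' system: it assumes a multinomial Naive Bayes classifier with add-one smoothing, whitespace tokenization, and a tiny made-up English dataset (`TOY_REVIEWS`). All function names are hypothetical. Real JD or TMall reviews would need a Chinese word segmenter in place of `tokenize`.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (assumption: pre-segmented text)."""
    return text.lower().split()

def train(samples):
    """Build bag-of-words counts per label from (text, label) pairs."""
    word_counts = {}          # label -> Counter of token frequencies
    label_counts = Counter()  # label -> number of training reviews
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return word_counts, label_counts, vocab

def predict(model, text):
    """Return the label with the highest Naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)  # log prior
        n_tokens = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                # Add-one (Laplace) smoothing avoids log(0).
                p = (word_counts[label][tok] + 1) / (n_tokens + len(vocab))
                score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical toy data, invented for illustration only.
TOY_REVIEWS = [
    ("great phone fast delivery", "pos"),
    ("battery life is great", "pos"),
    ("screen broke terrible quality", "neg"),
    ("terrible battery slow delivery", "neg"),
]

model = train(TOY_REVIEWS)
```

Calling `predict(model, "terrible quality")` classifies a new review, and `f_measure(precision, recall)` combines the two scores into one number. Because the F-measure is a harmonic mean, it is pulled toward the lower of the two values, which is consistent with the abstract naming recall as the quantity to improve.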

Cite this article:

Bo Guo, Shouguang Li, Hao Wang, Xiaojun Zhang, Wei Gong, Zhaojun Yu, Yu Sun. Examining Product Reviews with Sentiment Analysis and Opinion Mining. Data Analysis and Knowledge Discovery, 2017, 1(12): 1-9.

