New Technology of Library and Information Service, 2015, Vol. 31, Issue 7-8: 113-121. DOI: 10.11925/infotech.1003-3513.2015.07.15
Review Helpfulness Prediction Research Based on Review Sentiment Feature Sets
Nie Hui1, Rong Zhe2
1 School of Information Management, Sun Yat-Sen University, Guangzhou 510006, China;
2 Business School, Sun Yat-Sen University, Guangzhou 510275, China

[Objective] To predict the helpfulness of reviews using sentiment feature sets extracted by a dictionary matching method and a machine learning method. [Methods] Review sentiment feature sets are extracted with two approaches, sentiment dictionary matching and machine learning classification. This involves building a sentiment dictionary, designing an appropriate matching algorithm, and choosing the best sentiment classifier. A random forest algorithm is then applied to predict review helpfulness from the different sentiment feature sets. [Results] The combination of the two sentiment analysis methods performs best in predicting review helpfulness. The review's average sentiment score and deviation score, both derived from the sentiment dictionary method, show better predictive performance for review helpfulness. [Limitations] The study covers only reviews of search products and neglects reviews of experience products. The research dataset is limited. [Conclusions] Combining the sentiment dictionary matching method with the machine learning method can predict review helpfulness effectively.
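The dictionary-matching features named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `LEXICON` and `NEGATORS` sets, the negation rule, and in particular the reading of "deviation score" as the population standard deviation of sentence-level scores. The abstract does not define these, and the paper's own dictionary and matching algorithm may differ.

```python
from statistics import mean, pstdev

# Toy lexicon with hypothetical polarity weights (not the paper's dictionary).
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATORS = {"not", "never"}


def sentence_score(sentence):
    """Sum lexicon weights in one sentence, flipping the sign after a negator."""
    score, negate = 0.0, False
    for token in sentence.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATORS:
            negate = True
        elif word in LEXICON:
            score += -LEXICON[word] if negate else LEXICON[word]
            negate = False
    return score


def sentiment_features(review):
    """Return (average score, deviation score) over the review's sentences.

    Assumption: deviation is the population standard deviation of the
    sentence-level scores; the source does not specify its definition.
    """
    text = review.replace("!", ".").replace("?", ".")
    sentences = [s for s in text.split(".") if s.strip()]
    scores = [sentence_score(s) for s in sentences]
    if not scores:
        return 0.0, 0.0
    return mean(scores), pstdev(scores)


features = sentiment_features(
    "The screen is great. The battery is not good. Shipping was bad."
)
```

Each review then yields a small feature vector. In a pipeline like the one the abstract describes, such vectors would be passed to a random forest classifier together with the labels produced by a machine-learned sentiment classifier; that step is not shown here.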

Received: 20 January 2015      Published: 25 August 2015
CLC Number: TP391

Cite this article:

Nie Hui, Rong Zhe. Review Helpfulness Prediction Research Based on Review Sentiment Feature Sets. New Technology of Library and Information Service, 2015, 31(7-8): 113-121.


