Review Helpfulness Prediction Research Based on Review Sentiment Feature Sets
Nie Hui¹, Rong Zhe²
¹ School of Information Management, Sun Yat-Sen University, Guangzhou 510006, China
² Business School, Sun Yat-Sen University, Guangzhou 510275, China
Abstract [Objective] To predict the helpfulness of online reviews using sentiment feature sets extracted by a dictionary matching method and a machine learning method. [Methods] Review sentiment feature sets are extracted with two approaches: sentiment dictionary matching, which involves building a sentiment dictionary and designing a suitable matching algorithm, and machine learning classification, which involves choosing the best sentiment classifier. A random forest algorithm is then applied to predict review helpfulness from the different sentiment feature sets. [Results] Combining the two sentiment analysis methods performs best in predicting review helpfulness. The review's average sentiment score and deviation score, both derived from the sentiment dictionary method, show better predictive performance for review helpfulness. [Limitations] The study covers only reviews of search products and neglects reviews of experience products. The research dataset is limited. [Conclusions] The combination of the sentiment dictionary matching method and the machine learning method can predict review helpfulness effectively.
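As a rough illustration of the dictionary-based features named in the abstract, the sketch below computes an average sentiment score and a deviation score for one review. It is a minimal sketch under stated assumptions, not the authors' method: the toy lexicon `TOY_LEXICON`, the sentence splitting, the whitespace tokenization, and the reading of "deviation score" as the population standard deviation of sentence scores are all hypothetical choices made for this example.

```python
from statistics import mean, pstdev

# Hypothetical toy lexicon (word -> polarity); NOT the dictionary built in the paper.
TOY_LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}


def sentence_score(sentence, lexicon):
    """Sum the polarity of every lexicon word matched in one sentence."""
    tokens = sentence.lower().split()
    return sum(lexicon.get(tok.strip(".,!?;:"), 0.0) for tok in tokens)


def sentiment_features(review, lexicon=TOY_LEXICON):
    """Return (average sentence score, deviation of sentence scores).

    Assumption: "deviation" is taken here to be the population standard
    deviation across sentences; the paper may define it differently.
    """
    normalized = review.replace("!", ".").replace("?", ".")
    sentences = [s for s in normalized.split(".") if s.strip()]
    if not sentences:
        return 0.0, 0.0
    scores = [sentence_score(s, lexicon) for s in sentences]
    return mean(scores), pstdev(scores)


if __name__ == "__main__":
    print(sentiment_features("Great screen. Terrible battery."))
```

In a pipeline of the kind the abstract describes, features like these would be combined with the output of a trained sentiment classifier and passed to a random forest model. That step is omitted here because the abstract does not specify it.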
Received: 20 January 2015
Published: 25 August 2015