[Objective] This paper aims to extract multi-dimensional features of online reviews and to examine the impact of text content on review quality evaluation. [Methods] First, we quantified and extracted content features from the textual and sentiment information in the reviews. Then, we adopted the GBDT (gradient boosting decision tree) model to evaluate the influence of each feature set on the classification results, together with a greedy feature selection procedure to identify the most effective content features. Finally, we examined the influence of these features. [Results] The proposed method improved the performance of review quality evaluation, particularly in recall and precision. [Limitations] Our research focused on review data from search services and did not investigate products such as movies and music. [Conclusions] The information gained from reviews and product feature words, the degree of sentiment objectivity, and the differences among review contents all had important effects on review quality evaluation.
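The greedy feature selection step named in [Methods] can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, the toy scoring function, and the stopping threshold `min_gain` are all hypothetical. In the setting the abstract describes, `score` would presumably train a GBDT classifier on the candidate subset and return a validation metric; a fixed lookup is used here so the example runs with the standard library alone.

```python
from typing import Callable, List, Sequence, Tuple


def greedy_forward_selection(
    features: Sequence[str],
    score: Callable[[Tuple[str, ...]], float],
    min_gain: float = 1e-9,
) -> Tuple[List[str], float]:
    """Repeatedly add the single feature that most improves score(subset).

    Stops when the best remaining feature improves the score by less
    than min_gain (an assumed stopping rule, not taken from the paper).
    """
    selected: List[str] = []
    remaining = list(features)
    best = score(())  # baseline score with no features
    while remaining:
        # Score every one-feature extension of the current subset.
        trial = {f: score(tuple(selected + [f])) for f in remaining}
        candidate = max(remaining, key=lambda f: trial[f])
        if trial[candidate] - best < min_gain:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best = trial[candidate]
    return selected, best


# Hypothetical per-feature contributions; these are NOT results from the paper.
TOY_GAIN = {
    "info_gain": 0.20,
    "objectivity": 0.10,
    "content_difference": 0.05,
    "length": 0.0,
}


def toy_score(subset: Tuple[str, ...]) -> float:
    # Stand-in for "train a classifier on this subset and measure it".
    return 0.5 + sum(TOY_GAIN[f] for f in subset)


selected, best = greedy_forward_selection(list(TOY_GAIN), toy_score)
print(selected, round(best, 2))
```

With this toy scorer, the procedure adds features in order of their contribution and stops before `length`, which adds nothing. A real scorer would be far more expensive, since each candidate subset requires training and evaluating a model.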
孟园, 王洪伟. 基于文本内容特征选择的评论质量检测[J]. 现代图书情报技术, 2016, 32(4): 40-47.
Meng Yuan, Wang Hongwei. Evaluating Online Reviews Based on Text Content Features[J]. New Technology of Library and Information Service, 2016, 32(4): 40-47.