Data Analysis and Knowledge Discovery, 2017, Vol. 1, Issue 3: 62-71     https://doi.org/10.11925/infotech.2096-3467.2017.03.08
Research Paper
The Impacts of Reviews on Hotel Satisfaction: A Sentiment Analysis Method
Wu Weifang, Gao Baojun, Yang Haixia, Sun Hanlin
Economics and Management School, Wuhan University, Wuhan 430072, China
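Abstract (translated from the Chinese)

[Objective] To identify the factors that influence hotel customer satisfaction through text analysis of online reviews, and to offer recommendations to hotel managers. [Methods] Word2Vec was used to extract features from Tripadvisor.com hotel reviews and to reduce their dimensionality; sentiment analysis was then used to extract the sentiment attached to each feature category, and an econometric model was built to analyze the relationship between hotel feature evaluations and user satisfaction. [Results] (1) The more positive the sentiment expressed in a review, the higher the satisfaction, but the effect is not linear and instead follows a "U"-shaped pattern. (2) The more feature categories a user mentions in a review, the more likely that user is to be dissatisfied. (3) The feature categories that consumers attend to differ significantly between luxury and economy hotels: staff service matters more for the former, cleanliness for the latter. (4) For luxury hotels, satisfaction is significantly affected by the Internet feature dimension, whereas for economy hotels the effect of this dimension is not significant. [Limitations] The sample is not comprehensive; future work could crawl data from multiple cities for a fuller analysis. [Conclusions] The study links hotel features to consumer satisfaction from the perspective of review text and provides a theoretical basis for research on online word of mouth for hotels.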

Abstract

[Objective] This paper analyzes online hotel reviews to identify the factors that influence customer satisfaction and offers suggestions to hotel management. [Methods] First, we used Word2Vec to extract features from travelers' comments on Tripadvisor.com and to reduce their dimensionality. Second, we used sentiment analysis to extract the sentiment associated with each feature category. Finally, we constructed an econometric model to analyze the relationship between hotel feature evaluations and users' satisfaction. [Results] Reviewers who expressed more positive sentiment were generally more satisfied, but the relationship between the two was not linear. The more feature categories a user mentioned in a review, the more likely he or she was to be dissatisfied. Consumers paid more attention to staff service at luxury hotels and to cleanliness at economy hotels. Satisfaction with luxury hotels was significantly affected by the Internet feature dimension, whose effect was not significant for economy hotels. [Limitations] The sample was not comprehensive, and further studies are needed to analyze data from multiple cities. [Conclusions] This study lays a theoretical foundation for online word-of-mouth research from the perspective of user-generated content.
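The pipeline in the abstract (label each opinion unit with a feature category, score its sentiment, aggregate per review) could be sketched roughly as below. This is only an illustration: the toy lexicon, its weights, the negation rule, and the function names are all hypothetical, since this page does not specify the sentiment resource or scoring rule the authors used.

```python
from collections import defaultdict

# Hypothetical toy lexicon; the paper's actual sentiment resource is not given here.
LEXICON = {"great": 1.0, "amazing": 1.5, "friendly": 1.0, "helpful": 1.0,
           "clean": 0.5, "free": 0.5, "fast": 0.5, "dirty": -1.0, "rude": -1.5}
NEGATORS = {"not", "no", "never"}


def unit_sentiment(text):
    """Sum lexicon weights, flipping the sign of the word right after a negator."""
    score, negate = 0.0, False
    for token in text.lower().replace(".", " ").split():
        if token in NEGATORS:
            negate = True
            continue
        weight = LEXICON.get(token, 0.0)
        score += -weight if negate else weight
        negate = False  # negation only reaches the immediately following word
    return score


def review_features(units):
    """Aggregate (opinion_unit, label) pairs into per-feature sentiment totals."""
    totals = defaultdict(float)
    for text, label in units:
        totals[label] += unit_sentiment(text)
    return dict(totals)


# Opinion units of one made-up review, already labelled with feature categories.
units = [("staff was very friendly and helpful.", "Staff"),
         ("internet is free and fast", "Internet"),
         ("the room was not clean", "Cleanliness")]
scores = review_features(units)
num_of_feature = len(scores)  # number of distinct feature categories mentioned
```

Per-feature totals of this kind would play the role of the `*_senti` regressors, and the count of distinct categories the role of `Num_of_feature`.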

Key words: Comment Text; Hotel Features; Sentiment Analysis; Consumer Satisfaction
Received: 2016-12-05      Published: 2017-04-20
CLC number (ZTFLH): F59; G350
Cite this article:
Wu Weifang, Gao Baojun, Yang Haixia, Sun Hanlin. The Impacts of Reviews on Hotel Satisfaction: A Sentiment Analysis Method[J]. Data Analysis and Knowledge Discovery, 2017, 1(3): 62-71.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2017.03.08      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2017/V1/I3/62
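  Research framework of this study
  Online reviews on Tripadvisor.com
  Word cloud of high-frequency nouns in the review corpus
  Silhouette coefficient results
  Clustering result for the optimal K (K=7)
  Hotel feature clustering results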
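The captions above mention silhouette coefficient results and an optimal K of 7. As a rough illustration of what that criterion measures, here is a minimal sketch of the mean silhouette coefficient. It assumes Euclidean distance and uses made-up 2-D points in place of word vectors; the page does not say which distance or tooling the authors used.

```python
import math


def silhouette(points, labels):
    """Mean silhouette coefficient (Euclidean distance); needs at least 2 clusters."""
    def mean_dist(i, members):
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    total = 0.0
    for i, lab in enumerate(labels):
        if len(clusters[lab]) == 1:
            continue  # convention: a singleton cluster contributes 0
        a = mean_dist(i, clusters[lab])  # cohesion: mean distance within own cluster
        b = min(mean_dist(i, members)    # separation: nearest other cluster
                for other, members in clusters.items() if other != lab)
        total += (b - a) / max(a, b)
    return total / len(points)


# Toy data: two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette(points, [0, 0, 1, 1])  # clusters match the geometry
bad = silhouette(points, [0, 1, 0, 1])   # clusters cut across the geometry
```

Computing this score for each candidate K and keeping the K with the highest value is one common way to arrive at a choice such as K=7.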
| reviewid | Opinion Unit | Word_count | Label |
|---|---|---|---|
| 155764734 | the rooms were a great size and layout | 8 | Facility |
| 155344163 | it is fairly clean | 4 | Cleanliness |
| 117415795 | the food was great | 2 | Food |
| 117474538 | the location at the cosmo is great | 7 | Location |
| 117490549 | food amazing | 2 | Food |
| 118435844 | great value for the money | 5 | Value |
| 118683963 | internet is free and fast | 5 | Internet |
| 143482015 | staff was very friendly and helpful. | 6 | Staff |

  Examples of opinion units
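| Feature | Naive Bayes Precision | Naive Bayes Recall | SVM Precision | SVM Recall |
|---|---|---|---|---|
| Cleanliness | 74% | 78% | 75% | 77% |
| Facility | 78% | 64% | 82% | 82% |
| Food | 87% | 83% | 86% | 83% |
| Internet | 73% | 88% | 74% | 86% |
| Location | 65% | 84% | 69% | 85% |
| Staff | 88% | 85% | 87% | 87% |
| Value | 80% | 80% | 80% | 81% |
| Total | Accuracy 79% | | Accuracy 80% | |

  Comparison of feature classification performance of the machine-learning methods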
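For readers unfamiliar with the metrics in the table above, the following sketch shows how per-class precision, per-class recall, and overall accuracy are derived from true and predicted labels. The toy labels are made up for illustration and do not reproduce the paper's figures.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return ({label: (precision, recall)}, overall accuracy)."""
    tp, pred_count, true_count = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            tp[t] += 1
    metrics = {}
    for label in true_count:
        # precision: share of predictions for this label that were right
        precision = tp[label] / pred_count[label] if pred_count[label] else 0.0
        # recall: share of true instances of this label that were found
        recall = tp[label] / true_count[label]
        metrics[label] = (precision, recall)
    accuracy = sum(tp.values()) / len(y_true)
    return metrics, accuracy


# Made-up labels for five opinion units.
y_true = ["Food", "Food", "Staff", "Staff", "Value"]
y_pred = ["Food", "Staff", "Staff", "Staff", "Value"]
metrics, accuracy = per_class_metrics(y_true, y_pred)
```

In this toy case one Food unit is mislabelled as Staff, which lowers recall for Food and precision for Staff while leaving Value untouched.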
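| Statistic | N | Mean | St. Dev. | Min | Pctl(25) | Median | Pctl(75) | Max |
|---|---|---|---|---|---|---|---|---|
| food_senti | 5124 | 0.21 | 0.44 | -1.64 | 0 | 0 | 0.39 | 3.05 |
| facilities_senti | 5124 | 0.6 | 1.05 | -4.65 | 0 | 0.41 | 1.12 | 11.1 |
| value_senti | 5124 | 0.11 | 0.45 | -2.52 | 0 | 0 | 0.22 | 3.78 |
| staff_senti | 5124 | 0.35 | 0.73 | -4.02 | 0 | 0.12 | 0.71 | 8.17 |
| cleanliness_senti | 5124 | 0.2 | 0.56 | -2.8 | 0 | 0 | 0.41 | 4.16 |
| location_senti | 5124 | 0.21 | 0.38 | -1.45 | 0 | 0 | 0.38 | 2.77 |
| internet_senti | 5124 | 0.05 | 0.28 | -1.74 | 0 | 0 | 0 | 4.49 |
| location | 4310 | 4.31 | 0.96 | 1 | 3 | 4 | 5 | 5 |
| rooms | 4356 | 3.96 | 1.17 | 1 | 3 | 4 | 5 | 5 |
| value | 4862 | 3.87 | 1.21 | 1 | 3 | 4 | 5 | 5 |
| cleanliness | 4842 | 3.94 | 1.21 | 1 | 3 | 4 | 5 | 5 |
| sleepquality | 4074 | 4.03 | 1.18 | 1 | 3 | 4 | 5 | 5 |
| ave_sentiment | 5124 | 0.24 | 0.26 | -1.53 | 0.08 | 0.23 | 0.39 | 2.29 |
| AvgRating | 5124 | 3.78 | 1.22 | 1 | 3 | 4 | 5 | 5 |

  Descriptive statistics of the variables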
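| | Food | Facility | Value | Staff | Cleanliness | Location | Internet |
|---|---|---|---|---|---|---|---|
| Food | 1 | 0.18 | 0.09 | 0.13 | 0.14 | 0.09 | 0.07 |
| Facility | | 1 | 0.06 | 0.22 | 0.15 | 0.18 | 0.07 |
| Value | | | 1 | 0.10 | 0.13 | 0.04 | 0.11 |
| Staff | | | | 1 | 0.16 | 0.12 | 0.08 |
| Cleanliness | | | | | 1 | 0.06 | 0.11 |
| Location | | | | | | 1 | -0.003* |
| Internet | | | | | | | 1 |

  Correlation coefficients among the new dimensions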
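| | Location | Rooms | Value | Cleanliness | SleepQuality |
|---|---|---|---|---|---|
| Location | 1 | 0.61*** | 0.43** | 0.49*** | 0.53*** |
| Rooms | | 1 | 0.57*** | 0.73*** | 0.72*** |
| Value | | | 1 | 0.62*** | 0.61*** |
| Cleanliness | | | | 1 | 0.65*** |
| SleepQuality | | | | | 1 |

  Correlation coefficients among the original dimensions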
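The two correlation tables contrast the text-derived sentiment dimensions (pairwise coefficients of at most 0.22) with the site's original rating dimensions (0.43 to 0.73). A minimal sketch of how such coefficients can be computed is given below. The pairwise handling of missing ratings is an assumption prompted by the differing N values in the descriptive statistics; the authors' actual treatment of missing values is not stated on this page.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def pearson_pairwise(x, y):
    """Pearson correlation after dropping pairs where either rating is missing."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


# Made-up sub-ratings for four reviews; None marks a rating the reviewer skipped.
rooms = [1, 2, None, 4]
cleanliness = [2, 4, 5, 8]
r = pearson_pairwise(rooms, cleanliness)
```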
Dependent variable: as.factor(AvgRating)

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| y≥2 | 2.0906*** | 1.8908*** | 2.3932*** | 2.1126*** |
| | (0.0571) | (0.0747) | (0.0925) | (0.0600) |
| y≥3 | 1.0706*** | 0.8228*** | 1.3668*** | 0.9845*** |
| | (0.0438) | (0.0589) | (0.0686) | (0.0457) |
| y≥4 | -0.2140*** | -0.7763*** | 0.3546*** | -0.4444*** |
| | (0.0402) | (0.0577) | (0.0605) | (0.0429) |
| y≥5 | -1.6895*** | -2.5997*** | -0.9835*** | -1.9997*** |
| | (0.0456) | (0.0735) | (0.0626) | (0.0494) |
| food_senti | 0.3006*** | 0.4529*** | 0.4552*** | |
| | (0.0626) | (0.0786) | (0.1131) | |
| facility_senti | 0.6049*** | 0.5666*** | 0.4389*** | |
| | (0.0296) | (0.0415) | (0.0440) | |
| value_senti | 0.3540*** | 0.5665*** | 0.6579*** | |
| | (0.0599) | (0.0724) | (0.1231) | |
| staff_senti | 0.7608*** | 0.8931*** | 0.7486*** | |
| | (0.0424) | (0.0592) | (0.0651) | |
| cleanliness_senti | 0.4665*** | 0.6457*** | 0.7906*** | |
| | (0.0499) | (0.0604) | (0.1069) | |
| location_senti | 0.5236*** | 0.4926*** | 0.3709** | |
| | (0.0731) | (0.0984) | (0.1138) | |
| internet_senti | 0.0624 | 0.3743*** | -0.1412 | |
| | (0.0964) | (0.1049) | (0.3502) | |
| ave_sentiment | | | | 6.7401*** |
| | | | | (0.1954) |
| sentiment^2 | | | | -3.7624*** |
| | | | | (0.2259) |
| Num_of_feature | | | | -0.0802*** |
| | | | | (0.0178) |
| Observations | 5,124 | 2,625 | 2,499 | 5,124 |
| R2 | 0.2571 | 0.3437 | 0.2087 | 0.3229 |
| chi2 (df = 7) | 1,424.3920*** | 1,037.0600*** | 538.1301*** | 1,863.1670*** |

Standard errors in parentheses.

  Regression model results
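To make the coefficients above easier to read, the sketch below turns the column (1) estimates into predicted rating probabilities. It assumes an ordered (cumulative) logit of the form P(y ≥ j) = logistic(a_j + x·b), which is suggested by the `y≥j` intercept labels and the factor-coded dependent variable but is not stated on this page. The dictionary and function names are hypothetical.

```python
import math

# Column (1) estimates copied from the regression table.
INTERCEPTS = {2: 2.0906, 3: 1.0706, 4: -0.2140, 5: -1.6895}
COEFS = {"food_senti": 0.3006, "facility_senti": 0.6049, "value_senti": 0.3540,
         "staff_senti": 0.7608, "cleanliness_senti": 0.4665,
         "location_senti": 0.5236, "internet_senti": 0.0624}


def rating_probabilities(x):
    """P(rating = 1..5) under the ASSUMED form P(y >= j) = logistic(a_j + x.b)."""
    xb = sum(COEFS[name] * value for name, value in x.items())
    at_least = {1: 1.0}  # every rating is at least 1
    for j, a in INTERCEPTS.items():
        at_least[j] = 1.0 / (1.0 + math.exp(-(a + xb)))
    at_least[6] = 0.0    # no rating exceeds 5
    return {j: at_least[j] - at_least[j + 1] for j in range(1, 6)}


neutral = rating_probabilities({})                       # all sentiment scores zero
happy_staff = rating_probabilities({"staff_senti": 1.0})  # one unit of staff sentiment

# Column (4): average sentiment at which 6.7401*s - 3.7624*s**2 stops increasing.
turning_point = 6.7401 / (2 * 3.7624)
```

Under these assumptions, raising `staff_senti` by one unit shifts probability mass toward five-star ratings. Taking the column (4) coefficients at face value, the combined contribution of the linear and squared sentiment terms increases up to an average sentiment of roughly 0.90, which lies above the 75th percentile (0.39) reported in the descriptive statistics.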