Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (10): 1-11    DOI: 10.11925/infotech.2096-3467.2017.0338
Original Article
Analyzing Sentiment Polarity of Comments Based on Attributes
Li Hui, Chai Yaqing
School of Economics and Management, Xidian University, Xi’an 710126, China
Abstract  

[Objective] This article quantitatively studies the sentiment polarity of online comments based on the attributes of the commented targets. [Methods] First, we analyzed the comments in terms of their objects, attributes and contents. Second, we extracted the attribute words and the corresponding comment sets. Third, we introduced attribute factors and calculated their values with a modified TF-IDF formula. Finally, we implemented a quantitative analysis algorithm based on the attribute features in Python. [Results] Compared with traditional machine learning classification algorithms (e.g., NB and SVM), our method improved the accuracy of sentiment classification when the attribute factors were set to equal weights. [Limitations] The comment selection method and the coefficient parameters of the proposed algorithm need to be improved. [Conclusions] The proposed method can effectively improve the accuracy of sentiment classification.

Keywords: Comment Text; Attribute Factor; Comment Mode; Sentiment Polarity
Received: 26 April 2017      Published: 08 November 2017
ZTFLH:  G250  

Cite this article:

Li Hui, Chai Yaqing. Analyzing Sentiment Polarity of Comments Based on Attributes. Data Analysis and Knowledge Discovery, 2017, 1(10): 1-11.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.0338     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2017/V1/I10/1

Pattern | First word | Second word | Third word
Pattern 1 | JJ | NN or NNS | anything
Pattern 2 | RB, RBR, or RBS | JJ | not NN or NNS
Pattern 3 | NN or NNS | JJ | not NN or NNS
Pattern 4 | JJ | JJ | not NN or NNS
Pattern 5 | RB, RBR, or RBS | VB, VBD, VBN, or VBG | anything
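
The five part-of-speech patterns above can be checked mechanically against a tagged word sequence. The following is a minimal sketch, assuming the input is already a list of (word, Penn Treebank tag) tuples; the function name `extract_phrases` and the tuple layout are illustrative choices, not taken from the article.

```python
NOUN = {"NN", "NNS"}
ADVERB = {"RB", "RBR", "RBS"}
VERB = {"VB", "VBD", "VBN", "VBG"}

# Each entry: (tags allowed for word 1, tags allowed for word 2,
#              whether word 3 is allowed to be a noun)
PATTERNS = [
    ({"JJ"}, NOUN, True),     # Pattern 1
    (ADVERB, {"JJ"}, False),  # Pattern 2
    (NOUN, {"JJ"}, False),    # Pattern 3
    ({"JJ"}, {"JJ"}, False),  # Pattern 4
    (ADVERB, VERB, True),     # Pattern 5
]


def extract_phrases(tagged):
    """Return the two-word phrases whose tags match one of the five patterns."""
    phrases = []
    for i in range(len(tagged) - 1):
        (w1, t1), (w2, t2) = tagged[i], tagged[i + 1]
        # The third word only constrains the match; it is never extracted.
        t3 = tagged[i + 2][1] if i + 2 < len(tagged) else None
        for first, second, noun_allowed in PATTERNS:
            if t1 in first and t2 in second and (noun_allowed or t3 not in NOUN):
                phrases.append((w1, w2))
                break
    return phrases


print(extract_phrases([("very", "RB"), ("good", "JJ"), ("hotel", "NN")]))
```

In the example call, "very good" is rejected because Pattern 2 forbids a following noun, while "good hotel" is accepted by Pattern 1.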
Type | Conjunctions
Adversative | 但是、偏偏、只是、不过、至于、不料、岂知、虽然、然而、而、即使、但、可是、不过、却
Progressive | 而且、更、更加、并、甚至、不如、不及、乃至、并且、况、况且、何况
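
A conjunction lexicon of this kind is typically applied as a lookup over segmented tokens. The sketch below assumes the comment has already been word-segmented (so that 而 and 而且 are distinct tokens); the lists are shortened excerpts of the table above, and the function name `find_conjunctions` is hypothetical.

```python
# Shortened excerpts of the two conjunction lists; extend as needed.
ADVERSATIVE = ["但是", "不过", "虽然", "然而", "而", "但", "可是", "却"]
PROGRESSIVE = ["而且", "更", "更加", "甚至", "并且", "况且", "何况"]

CONJUNCTION_TYPE = {word: "adversative" for word in ADVERSATIVE}
CONJUNCTION_TYPE.update({word: "progressive" for word in PROGRESSIVE})


def find_conjunctions(tokens):
    """Return (token, type) for every token found in the conjunction lexicon."""
    return [(tok, CONJUNCTION_TYPE[tok]) for tok in tokens if tok in CONJUNCTION_TYPE]


print(find_conjunctions(["环境", "好", "但", "交通", "太", "差"]))
```

Working on tokens rather than raw substrings avoids matching a short conjunction inside a longer word, at the cost of depending on the quality of the segmenter.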
Dataset | Labeled positive comments | Labeled negative comments
chnSenticorp2000 | 1,000 | 1,000
chnSenticorp4000 | 2,000 | 2,000
chnSenticorp6000 | 3,000 | 3,000
chnSenticorp10000 | 3,000 | 7,000
Attribute (Feature) | Attribute words
F1: Environment (环境) | 风景、环境、氛围、外观、外表、条件、卫生、空气、酒店环境、酒店氛围、宾馆、周围、周围环境、周边环境、大堂、大堂环境、外观、门面、室内环境、室内、屋内、房子、房间、楼道、走廊、气味、味道、霉味、油漆味、烟味、噪音、噪声
F2: Facilities (设施) | 设施、设计、风格、配套、设备、设置、布置、装置、配备、装备、内饰、内里、建筑、格局、硬件、硬件设施、软件、软件设施、装修、卧具、家具、电梯、客房、标准间、房间面积、房间大小、光线、空间、电视、网络、网速、上网、宽带、空调、墙壁、墙纸、床、毛巾、床单、被罩、被褥、地毯、地板、地面、卫生间、洗手间、厕所、浴室、淋浴、浴缸、热水、洗澡、洗漱用品、个人用品、房间隔音、隔音、停车场、停车、周围设施、通风
F3: Catering (餐饮) | 餐饮、就餐、餐厅、饭菜、上菜、点餐、叫餐、早餐、早茶、早点、早饭、自助餐、下午茶、饮食、味道、品种、种类、吃饭
F4: Transportation (交通) | 交通、周围交通、路线、出行、外出、打车、进出、购物、景点
F5: Service (服务) | 服务态度、态度、表情、语气、口气、服务意识、服务员态度、服务、服务水平、素质、服务素质、前台、服务员、门童、服务生、前台服务、酒店服务、管理、退房、客服
F6: Price (价格) | 价格、收费、价钱、价位、性价比、房价、结账、账单、手续
F7: Location (位置) | 地理位置、位置、地位、地点、地方、地段、场所、火车站、机场
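
To assign an extracted attribute word to one of the seven attributes, the lexicon can be inverted into a word-to-attribute index. This is a minimal sketch using a small excerpt of the lexicon; the names `ATTRIBUTE_WORDS` and `build_index` are illustrative. Because a word such as 味道 is listed under both F1 and F3, the index maps each word to a list of candidate attributes rather than a single one; how the article resolves such ambiguity is not stated here.

```python
from collections import defaultdict

# Small excerpt of the attribute lexicon; the full word lists are in the table.
ATTRIBUTE_WORDS = {
    "F1": ["风景", "环境", "味道"],
    "F3": ["早餐", "饭菜", "味道"],
    "F5": ["服务", "前台", "服务意识"],
}


def build_index(lexicon):
    """Invert {attribute: [words]} into {word: [attributes]}, skipping repeats."""
    index = defaultdict(list)
    for attribute, words in lexicon.items():
        for word in words:
            if attribute not in index[word]:
                index[word].append(attribute)
    return dict(index)


index = build_index(ATTRIBUTE_WORDS)
print(index["早餐"])
print(index["味道"])
```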
Sentiment lexicon | Positive words | Negative words | Total
HowNet | 4,566 | 4,370 | 8,851
NTUSD | 2,846 | 8,325 | 10,027
Predicted label | Correct label: True | Correct label: False
Positive | TP (True Positive) | FP (False Positive)
Negative | TN (True Negative) | FN (False Negative)
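
The four counts TP, FP, TN and FN are enough to compute the usual evaluation metrics. The sketch below uses the standard textbook definitions of accuracy, precision, recall and F1; the article's own formulas are not reproduced on this page, so treat the exact definitions as an assumption. The function name and the example counts are illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts, purely for illustration.
print(classification_metrics(tp=40, fp=10, tn=45, fn=5))
```

The zero checks keep the function defined when a class is never predicted or never present, which can happen on small or heavily imbalanced samples.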
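Attribute | Environment | Facilities | Catering | Transportation | Service | Price | Location
Attribute factor | 0.501406 | 0.042195 | 0.029424 | 0.005845 | 0.389272 | 0.019860 | 0.011962
Attribute factor (control) | 0.142857 | 0.142857 | 0.142857 | 0.142857 | 0.142857 | 0.142857 | 0.142857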
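
The control row corresponds to giving each of the seven attributes the same weight, 1/7 ≈ 0.142857, while the computed factors sum to approximately 1. The sketch below reproduces the control row and shows one possible way such factors could weight per-attribute polarity scores. The aggregation in `weighted_score` is a hypothetical illustration, not the article's formula, which is not given on this page; the English dictionary keys are likewise chosen here for readability.

```python
# Attribute factors as listed in the table (keys translated for readability).
ATTRIBUTE_FACTORS = {
    "environment": 0.501406,
    "facilities": 0.042195,
    "catering": 0.029424,
    "transportation": 0.005845,
    "service": 0.389272,
    "price": 0.019860,
    "location": 0.011962,
}


def equal_weights(factors):
    """Control setting: every attribute gets the same weight 1/n."""
    n = len(factors)
    return {name: round(1 / n, 6) for name in factors}


def weighted_score(pair_scores, factors):
    """Hypothetical aggregation: sum of (attribute factor x polarity score)."""
    return sum(factors[attribute] * score for attribute, score in pair_scores.items())


print(equal_weights(ATTRIBUTE_FACTORS)["service"])
print(round(sum(ATTRIBUTE_FACTORS.values()), 6))
```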
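Columns: comment ID; preprocessed comment; extracted attribute-sentiment pairs; POS tagging; sentiment polarity value; sentiment class (P = positive, N = negative); remarks.

Comment1
Preprocessed comment: |风景还算不错|酒店早餐很难吃 (the scenery is fairly good | the hotel breakfast tastes very bad)
Attribute-sentiment pairs: <风景, 不错, 还算>; <早餐, 难吃, 很>
Sentiment polarity value: -0.305203935
Sentiment class: N
Remarks: 1 means no degree adverb

Comment2
Preprocessed comment: |房间家具太差|早餐质量太差|环境好但交通太差 (the room furniture is too poor | the breakfast quality is too poor | the environment is good but the transportation is too poor)
Attribute-sentiment pairs: <家具, 大, 1>; <早餐, 差, 太>; <环境, 好, 1>; <交通, 差, 太>
POS tagging: 但: adversative conjunction
Sentiment polarity value: -1.532515171
Sentiment class: N

Comment3
Preprocessed comment: |但房间里的淋浴设施不好|前台小姐服务很不好|服务意识太差 (but the shower facilities in the room are not good | the front-desk service is very bad | the service awareness is too poor)
Attribute-sentiment pairs: <设施, 不好, 1>; <服务, 不好, 很>; <服务意识, 差, 太>
Sentiment polarity value: -2.035849256
Sentiment class: N

Comment4
Preprocessed comment: |环境比较温馨|房间比较干净|卫生间设施较完备 (the environment is rather cozy | the room is rather clean | the bathroom facilities are fairly complete)
Attribute-sentiment pairs: <环境, 温馨, 比较>; <房间, 干净, 比较>; <设施, 完善, 较>
Sentiment polarity value: 0.709084635
Sentiment class: P

Comment5
Preprocessed comment: |虽然房间的条件略显简陋|但环境、服务还有饭菜都还是很不错的 (although the room conditions are somewhat basic | the environment, service and food are all still very good)
Attribute-sentiment pairs: <条件, 简陋, 1>; <环境, 不错, 很>; <服务, 不错, 很>; <饭菜, 不错, 很>
POS tagging: 但: adversative conjunction
Sentiment polarity value: 0.405485228
Sentiment class: P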
Accuracy
Corpus | Equal-weight attribute factors | NB (traditional) | SVM (traditional) | Proposed algorithm
chnSenticorp2000 | 88.33% | 79.1% | 87.9% | 89.23%
chnSenticorp4000 | 89.56% | 83.2% | 88.1% | 89.90%
chnSenticorp6000 | 90.01% | 85.4% | 90.8% | 91.45%
chnSenticorp10000 | 91.59% | 87.3% | 91.1% | 92.88%
F1
Corpus | Equal-weight attribute factors | NB (traditional) | SVM (traditional) | Proposed algorithm
chnSenticorp2000 | 80.32% | 73.2% | 79.3% | 81.13%
chnSenticorp4000 | 80.57% | 79.2% | 80.1% | 82.60%
chnSenticorp6000 | 82.31% | 80.1% | 81.8% | 84.25%
chnSenticorp10000 | 82.69% | 80.9% | 82.1% | 85.19%