Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (9): 115-123    DOI: 10.11925/infotech.2096-3467.2018.1429
Extracting Emotion Tags from Comments of Microblog Commodities
Bocheng Li1, Yunqiu Zhang1, Kaixi Yang2
1 College of Public Health, Jilin University, Changchun 130021, China
2 International School of Information Science & Engineering, Dalian University of Technology, Dalian 116620, China
Abstract  

[Objective] This paper proposes a new method for extracting emotion tags from microblog comments, aiming to improve the performance of feature-level data extraction. [Methods] First, we divided the comments into evaluation units and extracted explicit tags using dependency parsing and a set of extraction rules. Then, we used the NodeRank algorithm to reveal the implicit expression relationships in the comments. Finally, we retrieved the implicit tags to improve the accuracy of emotion tag extraction. [Results] We evaluated the proposed method on real online comments. Its overall precision was 83.6%, its recall was 87.1%, and its F value was 85.3%, all better than those of traditional methods. [Limitations] We did not fully utilize users' general emotional expressions. [Conclusions] The proposed method, based on dependency parsing and the NodeRank algorithm, extracts emotion tags effectively.

Key words: Opinion Mining; Dependency Syntax Analysis; NodeRank Algorithm; Microblog Emotional Tags
Received: 19 December 2018      Published: 23 October 2019
CLC Number: TP391.1

Cite this article:

Bocheng Li, Yunqiu Zhang, Kaixi Yang. Extracting Emotion Tags from Comments of Microblog Commodities. Data Analysis and Knowledge Discovery, 2019, 3(9): 115-123.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2018.1429     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2019/V3/I9/115

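Dependency relation | Relation type | Example (related word pair)
SBV | Subject-verb | 做工细腻, "the workmanship is fine" (做工, 细腻)
ATT | Attribute (modifier-head) | 是个好手表, "it is a good watch" (手表, 好)
COO | Coordinate | 外观和手感, "appearance and feel" (外观, 手感)
VOB | Verb-object | 可惜没有蜂窝网络, "unfortunately there is no cellular network" (没有, 网络)
ADV | Adverbial | 外观非常漂亮, "the appearance is very beautiful" (非常, 漂亮)
CMP | Complement | 心率监测一般, "the heart-rate monitoring is mediocre" (一般, 监测)
HED | Head | The core of the whole sentence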
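Dependency relation | Extraction rule | Notes
SBV | Extract <SBV modifier, dependency-chain expansion of the SBV head, SBV head> | The subject of the sentence is the SBV modifier and the predicate is the SBV head.
SBV+VOB | Extract <SBV modifier, expansion of the full VOB dependency chain, full VOB dependency chain> | The predicate is the head of both the SBV and the VOB relations.
SBV+CMP | Extract <SBV modifier, expansion of the full CMP dependency chain, full CMP dependency chain> | The predicate is the head of both the SBV and the CMP relations.
SBV+COO | Extract <SBV modifier, dependency-chain expansion of the SBV head, SBV head>; <COO modifier, dependency-chain expansion of the SBV head, SBV head> | The subject consists of the SBV modifier together with the COO modifier.
VOB | Extract <VOB modifier, dependency-chain expansion of the VOB head, VOB head> | Applies if the POS of the VOB modifier is n or j and the POS of the head is v.
VOB | Extract <VOB head, dependency-chain expansion of the VOB modifier, VOB modifier> | Applies if the POS of the VOB modifier is a, b, or i and the POS of the head is v.
CMP | Extract <CMP head, dependency-chain expansion of the CMP modifier, CMP modifier> |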
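
To make the first rule concrete, the following is a minimal Python sketch of the SBV rule applied to a hand-written dependency parse. It is an illustration, not the authors' implementation: the `Token` class, the `extract_sbv_tags` function, and the parse itself are hypothetical, and "dependency-chain expansion" is assumed here to mean collecting the ADV dependents of the SBV head.

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int   # 1-based position in the sentence
    word: str
    pos: str   # part-of-speech tag
    head: int  # index of the head token, 0 for the root
    rel: str   # dependency relation to the head (SBV, ADV, HED, ...)


def extract_sbv_tags(tokens):
    """Return <SBV modifier, expansion of the head, SBV head> triples.

    Assumption: the expansion is limited to ADV dependents of the head.
    """
    by_idx = {t.idx: t for t in tokens}
    triples = []
    for t in tokens:
        if t.rel != "SBV":
            continue
        head = by_idx[t.head]
        expansion = [m.word for m in tokens
                     if m.head == head.idx and m.rel == "ADV"]
        triples.append((t.word, expansion, head.word))
    return triples


# Hand-written (assumed) parse of "外观非常漂亮",
# "the appearance is very beautiful".
sentence = [
    Token(1, "外观", "n", 3, "SBV"),
    Token(2, "非常", "d", 3, "ADV"),
    Token(3, "漂亮", "a", 0, "HED"),
]
tags = extract_sbv_tags(sentence)
print(tags)  # [('外观', ['非常'], '漂亮')]
```

In practice the parse would come from a dependency parser rather than being typed by hand, and the remaining rules (SBV+VOB, SBV+CMP, SBV+COO, VOB, CMP) would each need their own branch.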
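Sentiment word (smartwatch) | Feature word (smartwatch) | NR value (smartwatch) | Sentiment word (mobile phone) | Feature word (mobile phone) | NR value (mobile phone)
[missing] | 表盘 (dial) | 0.0053 | 流畅 (smooth) | 系统 (system) | 0.0051
好看 (good-looking) | 外观 (appearance) | 0.0053 | 耐用 (durable) | 电池 (battery) | 0.0051
划痕 (scratches) | 屏幕 (screen) | 0.0056 | 漏光 (light leakage) | 屏幕 (screen) | 0.0051
瑕疵 (flaw) | 手表 (watch) | 0.0056 | 刺眼 (glaring) | 屏幕 (screen) | 0.0054
LOW | 表带 (strap) | 0.0062 | 划痕 (scratches) | 屏幕 (screen) | 0.0054
轻便 (lightweight) | 佩戴 (wearing) | 0.0062 | 抗用 (long-lasting) | 电池 (battery) | 0.0067
透气 (breathable) | 表带 (strap) | 0.0062 | 噪音 (noise) | 通话 (calls) | 0.0067
柔软 (soft) | 表带 (strap) | 0.0068 | 卡顿 (laggy) | 系统 (system) | 0.0073
捂汗 (traps sweat) | 表带 (strap) | 0.0068 | 黑点 (black spots) | 屏幕 (screen) | 0.0073
炫酷 (cool) | 外观 (appearance) | 0.0068 | 沾指纹 (fingerprint-prone) | 背壳 (back cover) | 0.0085
掉皮 (peeling) | 表带 (strap) | 0.0075 | 浴霸 (slang: "bathroom heater lamp") | 摄像头 (camera module) | 0.0085
漂亮 (beautiful) | 外观 (appearance) | 0.0075 | 噪点 (image noise) | 相机 (camera) | 0.0085
黑点 (black spots) | 屏幕 (screen) | 0.0083 | 美轮美奂 (magnificent) | 颜色 (color) | 0.0093
透汗 (lets sweat through) | 表带 (strap) | 0.0083 | 杠杠的 (slang: top-notch) | 质量 (quality) | 0.0093
省电 (power-saving) | 电池 (battery) | 0.0088 | [missing] | 价格 (price) | 0.0102
友好 (friendly) | 系统 (system) | 0.0088 | [missing] | 手机 (phone) | 0.0102
抗用 (long-lasting) | 电池 (battery) | 0.0096 | 毛刺 (burrs) | 中框 (middle frame) | 0.0115
时尚 (fashionable) | 外观 (appearance) | 0.0096 | 掉漆 (paint chipping) | 手机 (phone) | 0.0115
迟钝 (sluggish) | 系统 (system) | 0.0112 | 清透 (clear) | 屏幕 (screen) | 0.0115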
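
One way such sentiment-feature pairs could be used to recover implicit tags is sketched below in Python. This is an assumption-laden illustration, not the paper's procedure: the four rows are copied from the smartwatch columns above, but the `implicit_tags` function, the substring matching, and the sample comments are hypothetical, and the page does not say how the NR values are thresholded, so they are stored but not used for filtering.

```python
# (sentiment word, feature word, NR value), copied from the smartwatch columns.
SMARTWATCH_PAIRS = [
    ("划痕", "屏幕", 0.0056),
    ("透气", "表带", 0.0062),
    ("省电", "电池", 0.0088),
    ("迟钝", "系统", 0.0112),
]

# Map each sentiment word to its associated feature word and NR value.
lookup = {s: (f, nr) for s, f, nr in SMARTWATCH_PAIRS}


def implicit_tags(comment, lookup):
    """Return (feature, sentiment) tags for sentiment words whose
    feature word is not mentioned explicitly in the comment."""
    tags = []
    for sentiment, (feature, _nr) in lookup.items():
        if sentiment in comment and feature not in comment:
            tags.append((feature, sentiment))
    return tags


# "Breathable to wear, and also power-saving": no feature word is named.
implicit = implicit_tags("戴着很透气,也很省电", lookup)
print(implicit)  # [('表带', '透气'), ('电池', '省电')]

# "The screen has scratches": the feature is explicit, so nothing is inferred.
explicit = implicit_tags("屏幕有划痕", lookup)
print(explicit)  # []
```

A real system would segment the comment into words instead of matching substrings, and would presumably use the NR value to decide which pairs are reliable enough to apply.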
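Dataset | Tag type | P | R | F value
Smartwatch | Explicit tags | 86.9% | 87.1% | 87.0%
Smartwatch | Implicit tags | 76.3% | 86.1% | 80.9%
Smartwatch | Overall | 82.7% | 86.7% | 84.7%
Mobile phone | Explicit tags | 88.9% | 88.0% | 88.4%
Mobile phone | Implicit tags | 76.5% | 86.2% | 81.1%
Mobile phone | Overall | 84.3% | 87.4% | 85.8%
SUM | Explicit tags | 88.0% | 87.6% | 87.8%
SUM | Implicit tags | 76.4% | 86.1% | 81.0%
SUM | Overall | 83.6% | 87.1% | 85.3%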
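
The reported F values are consistent with the balanced F-measure, the harmonic mean of precision and recall. The page does not state the formula, so treating "F value" as F1 is an assumption; the short Python check below (with a hypothetical `f_measure` helper) reproduces the overall figure from the reported P and R.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall on the combined (SUM) data set.
overall = f_measure(0.836, 0.871)
print(f"{overall:.1%}")  # 85.3%
```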
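Tag type | P (proposed method) | R (proposed method) | F value (proposed method) | P (method of Ref. [22]) | R (method of Ref. [22]) | F value (method of Ref. [22])
Explicit tags | 88.0% | 87.6% | 87.8% | 82.8% | 82.4% | 82.6%
Implicit tags | 76.4% | 86.1% | 81.0% | 72.2% | 81.3% | 76.5%
Overall | 83.6% | 87.1% | 85.3% | 78.7% | 82.0% | 80.3%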
[1] China Internet Network Information Center. The 43rd China Internet Development Statistics Report [R/OL]. (2019-02-28). [2019-03-02]. http://cnnic.cn/gywm/xwzx/rdxw/20172017_7056/201902/t20190228_70643.htm. (in Chinese)
[2] Tang Xiaobo, Wang Hongyan. Research on Microblogging Product Reviews Mining Model[J]. Journal of Intelligence, 2013, 32(2): 107-111, 127. (in Chinese)
[3] Liu B. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data[M]. The 2nd Edition. Berlin: Springer, 2011: 459-460.
[4] Shi Wei, Wang Hongwei, He Shaoyi. Product Reviews Mining from Microblogging Based on Sentiment Analysis[J]. Journal of the China Society for Scientific and Technical Information, 2014, 33(12): 1311-1321. (in Chinese)
[5] Ding Shengchun, Wu Jingchanyuan, Li Hongmei. Chinese Micro-blogging Opinion Recognition Based on SVM Model[J]. Journal of the China Society for Scientific and Technical Information, 2016, 35(12): 1235-1243. (in Chinese)
[6] Hu Mozhi. Recognition of Chinese Microblog Sentiment Polarity and Extraction of Opinion Target[D]. Shanghai: Shanghai Jiaotong University, 2015. (in Chinese)
[7] Bao Jia'na. A Study on Chinese Microblog Opinion Mining Based on Automobile Domain[D]. Shanghai: Shanghai Jiaotong University, 2014. (in Chinese)
[8] Mou Yanfei. Analysis of Objects and Evaluation Words in Microblog Clause Based on Topic[J]. Modern Chinese, 2016(3): 114-116. (in Chinese)
[9] Ding Shengchun, Meng Meiren, Li Xiao. Study of Subjective Sentence Identification Oriented to Chinese Microblog[J]. Journal of the China Society for Scientific and Technical Information, 2014, 33(2): 175-182. (in Chinese)
[10] Qiu Yunfei, Ni Xuefeng, Shao Liangshan. Research on Extracting Method of Commodities Implicit Opinion Targets[J]. Computer Engineering and Applications, 2015, 51(19): 114-118. (in Chinese)
[11] Nikfarjam A, Emadzadeh E, Gonzalez G. A Hybrid System for Emotion Extraction from Suicide Notes[J]. Biomedical Informatics Insights, 2012, 5(S1): 165-174.
[12] Feng C, Liao C, Liu Z, et al. A Hybrid Method of Sentiment Key Sentence Identification Using Lexical Semantics and Syntactic Dependencies[C]// Proceedings of the 2014 Asia-Pacific Web Conference. Springer, 2014: 11-22.
[13] Dong Lili, Zhao Fanrong, Zhang Xiang. Analysing Propensity of Product Reviews Based on Domain Ontology and Sentiment Lexicon[J]. Computer Applications and Software, 2014, 31(12): 104-108. (in Chinese)
[14] Wang F, Xu Y, Wu Y, et al. Predicting the Semantic Orientation of Emoticons[J]. Journal of Computational Information Systems, 2013, 9(6): 2391-2398.
[15] Lan Tian, Guo Gongde. Bursty Topic Detection Based on Word Co-Occurrence and Emotions[J]. Computer Systems and Applications, 2016, 25(8): 101-108. (in Chinese)
[16] Popescu A M, Etzioni O. Extracting Product Features and Opinions from Reviews[A]// Kao A, Poteet S R. Natural Language Processing and Text Mining[M]. 2007: 9-28.
[17] Kumar V R, Raghuveer K. Dependency Driven Semantic Approach to Product Features Extraction and Summarization Using Customer Reviews[J]. Advances in Intelligent Systems & Computing, 2013, 178: 225-238.
[18] Xia Mengnan, Du Yongping, Zuo Benxin. Micro-blog Opinion Analysis Based on Syntactic Dependency and Feature Combination[J]. Journal of Shandong University: Science Edition, 2014, 49(11): 22-30. (in Chinese)
[19] Nie Hui, Du Jiazhong. Using Dependency Parsing Pattern to Extract Product Feature Tags[J]. New Technology of Library and Information Service, 2014(12): 44-50. (in Chinese)
[20] Wu Qinglin, Zhou Tianhong. Analysis of the Public Opinion of Chinese Microblog Based on Topic Clustering and Emotional Intensity[J]. Information Studies: Theory & Practice, 2016, 39(1): 109-112. (in Chinese)
[21] Li Guangmin, Chen Chi, Xing Jiang, et al. Overview of Extracting Product Feature from Text Reviews[J]. Journal of Modern Information, 2016, 36(8): 168-173. (in Chinese)
[22] Tang Xiaobo, Lan Yuting. Sentiment Analysis of Microblog Product Reviews Based on Feature Ontology[J]. Library and Information Service, 2016, 60(16): 121-127. (in Chinese)
[23] Tuarob S, Tucker C S. A Product Feature Inference Model for Mining Implicit Customer Preferences Within Large Scale Social Media Networks[C]// Proceedings of the ASME 2015 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. 2015.
[24] Liu Zhiming, Yu Bo, Ouyang Chunping, et al. SE-TextRank Opinion Summarization Method Based on Topic Model[J]. Technology Intelligence Engineering, 2017, 3(3): 97-104. (in Chinese)
[25] He Yanxiang, Liu Jianbo, Sun Songtao, et al. Product Reviews Sentiment Classification in Micro-blog Based on Cascaded Conditional Random Field[J]. Journal of Shandong University: Science Edition, 2015, 50(11): 67-73. (in Chinese)
[26] Li P, Qiu X. NodeRank: An Algorithm to Assess State Enumeration Attack Graphs[C]// Proceedings of the 8th International Conference on Wireless Communications, Networking and Mobile Computing. IEEE, 2012.
[27] Zhou Lixin, Lin Jie. Extracting Product Features with NodeRank Algorithm[J]. Data Analysis and Knowledge Discovery, 2018, 2(4): 90-98. (in Chinese)