Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (2/3): 207-213    DOI: 10.11925/infotech.2096-3467.2019.0678
Fine-Grained Sentiment Analysis with CRF and ATAE-LSTM
Xue Fuliang, Liu Lifang
Business School, Tianjin University of Finance & Economics, Tianjin 300222, China
Abstract

[Objective] This paper applies fine-grained sentiment analysis to extract product attributes and the sentiments expressed toward them, clusters the attribute terms into aspects, and analyzes user sentiment toward each product aspect. [Methods] First, we extracted product attribute terms with a conditional random field (CRF). Then, we classified the sentiment toward each extracted term with an attention-based long short-term memory network (ATAE-LSTM). Finally, we clustered the attribute terms into aspects using Word2Vec and analyzed aspect-level sentiment for products on an e-commerce platform. [Results] The F1 scores were 0.76 for CRF attribute term extraction and 0.78 for ATAE-LSTM attribute sentiment analysis. [Limitations] Only explicit attribute terms were extracted, and extraction of implicit attribute terms performed poorly; the dataset is also small. [Conclusions] Attribute term extraction, sentiment analysis, and aspect clustering together can explain users' preferences for product attributes reasonably well.

Keywords: CRF; LSTM; Attention Mechanism; Sentiment Analysis; Word2Vec
Received: 2019-06-14
CLC Number: TP391
Corresponding author: Xue Fuliang     E-mail: fuliangxue@163.com
Cite this article:
Xue Fuliang, Liu Lifang. Fine-Grained Sentiment Analysis with CRF and ATAE-LSTM[J]. Data Analysis and Knowledge Discovery, 2020, 4(2/3): 207-213. DOI: 10.11925/infotech.2096-3467.2019.0678.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2019.0678
Fig. 1  Research framework
Fig. 2  ATAE-LSTM architecture
Fig. 3  Skip-gram model
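The figures themselves are not reproduced on this page. For orientation, the attention layer of ATAE-LSTM as formulated in the original paper by Wang et al. (2016) can be written as below. This is a summary of that published formulation, not of this article's exact configuration, which is not shown here. The symbols are those of Wang et al.: $H \in \mathbb{R}^{d \times N}$ stacks the LSTM hidden states $h_1, \dots, h_N$, $v_a$ is the aspect embedding, $e_N$ is a length-$N$ vector of ones, and each LSTM input is the word embedding concatenated with $v_a$.

```latex
\begin{aligned}
M      &= \tanh\!\left(\begin{bmatrix} W_h H \\ W_v v_a \otimes e_N \end{bmatrix}\right) \\
\alpha &= \operatorname{softmax}\!\left(w^{\top} M\right) \\
r      &= H \alpha^{\top} \\
h^{*}  &= \tanh\!\left(W_p r + W_x h_N\right) \\
y      &= \operatorname{softmax}\!\left(W_s h^{*} + b_s\right)
\end{aligned}
```

Here $\alpha$ holds one attention weight per token, $r$ is the attention-weighted sentence representation, and $y$ is the predicted distribution over sentiment polarities.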
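Review token                          POS   Tag
运行 (run)                             v     B
速度 (speed)                           n     I
快 (fast)                              a     O
, (comma)                             x     O
电池 (battery)                         n     B
也 (also)                              d     O
很 (very)                              d     O
耐用 (durable)                         a     O
(punctuation, token not preserved)    x     O
Table 1  Example of attribute term tagging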
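The B/I/O tags in Table 1 mark the beginning (B), inside (I), and outside (O) of an attribute term. A minimal sketch of how a tagged sequence could be decoded back into attribute terms follows; the function name `extract_terms` is hypothetical, and the CRF that would predict the tags is not shown.

```python
def extract_terms(tokens, tags):
    """Join each run of B/I-tagged tokens into one attribute term."""
    terms, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new term starts; close any term still open.
            if current:
                terms.append("".join(current))
            current = [token]
        elif tag == "I" and current:
            # Continue the open term.
            current.append(token)
        else:
            # "O" (or a stray "I" with no open term) closes the open term.
            if current:
                terms.append("".join(current))
            current = []
    if current:
        terms.append("".join(current))
    return terms


# Tokens and tags copied from the tagging example in Table 1.
tokens = ["运行", "速度", "快", ",", "电池", "也", "很", "耐用"]
tags = ["B", "I", "O", "O", "B", "O", "O", "O"]
print(extract_terms(tokens, tags))  # ['运行速度', '电池']
```

Chinese tokens are joined without spaces, so 运行 + 速度 becomes the single term 运行速度 (running speed).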
Fig. 4  Trend of Euclidean distance for different numbers of clusters K
Fig. 5  Attribute term clustering results
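Fig. 4 plots a Euclidean distance measure against the number of clusters K, which is how an "elbow" is usually located. The sketch below assumes K-means and assumes the plotted quantity is the summed distance from each vector to its nearest centroid; neither is confirmed on this page. The 2-D vectors are hypothetical stand-ins for Word2Vec embeddings of attribute terms.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, centroids):
    """Index of the centroid closest to point (first one wins on ties)."""
    return min(range(len(centroids)), key=lambda i: euclidean(point, centroids[i]))


def kmeans(points, k, iterations=20):
    """Plain K-means; the first k points are the initial centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


def total_distance(points, centroids):
    """Sum of Euclidean distances from each point to its nearest centroid."""
    return sum(euclidean(p, centroids[nearest(p, centroids)]) for p in points)


# Hypothetical 2-D stand-ins for Word2Vec vectors of six attribute terms.
vectors = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0), (0.0, 1.0), (5.0, 6.0), (10.0, 1.0)]
curve = {k: total_distance(vectors, kmeans(vectors, k)) for k in range(1, 5)}
for k, distance in curve.items():
    print(k, round(distance, 2))
```

On this toy data the total distance drops sharply up to K=3 and only slightly afterwards, which is the kind of bend one would look for when choosing K.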
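Experiment                                          P      R      F1
CRF-based attribute term extraction                 0.84   0.70   0.76
Association-rule-based attribute term extraction    0.48   0.14   0.21
ATAE-LSTM-based attribute sentiment analysis        0.78   0.81   0.78
LSTM-based attribute sentiment analysis             0.71   0.79   0.73
Table 2  Experimental results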
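Table 2 reports precision (P), recall (R), and F1. A minimal sketch of how these could be computed for term extraction is given below, assuming exact-match comparison between predicted and gold term sets; the page does not state the matching criterion, and both function names are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Exact-match precision, recall and F1 between two collections of terms."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f1_from_pr(precision, recall)


def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical predictions and gold labels, for illustration only.
p, r, f1 = precision_recall_f1(
    ["running speed", "battery", "screen"],
    ["running speed", "battery", "signal", "appearance"],
)
print(round(p, 2), round(r, 2), round(f1, 2))

# Harmonic mean of the P and R reported for the CRF row of Table 2.
print(round(f1_from_pr(0.84, 0.70), 2))  # 0.76
```

The harmonic mean of the reported P and R for the CRF row reproduces its reported F1 of 0.76. The sentiment rows need not satisfy this identity if their scores are averaged over polarity classes, which this page does not specify.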
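Aspect                    Attribute term              Positive   Neutral   Negative
Design                    Design (设计)                75%        0         25%
Appearance and function   Signal (信号)                4%         96%       0
                          Camera (相机)                75%        0         25%
                          Appearance (外形)            89%        0         11%
                          Camera lens (摄像头)          9%         0         91%
Speed                     Charging speed (充电速度)     100%       0         0
                          System speed (系统速度)       100%       0         0
Table 3  Sentiment indicators for selected aspects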
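The percentages in Table 3 are shares of positive, neutral, and negative sentiment per attribute term, grouped by aspect. One way such shares could be derived from per-mention sentiment labels is sketched below; the record layout, the function name `sentiment_distribution`, and the sample counts are hypothetical, since the authors' aggregation procedure is not shown on this page.

```python
from collections import Counter, defaultdict

POLARITIES = ("positive", "neutral", "negative")


def sentiment_distribution(records):
    """Map each (aspect, term) pair to its share of each polarity label."""
    counts = defaultdict(Counter)
    for aspect, term, polarity in records:
        counts[(aspect, term)][polarity] += 1
    shares = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        shares[key] = {p: counter[p] / total for p in POLARITIES}
    return shares


# Hypothetical mentions: three positive and one negative for one term.
records = [
    ("design", "design", "positive"),
    ("design", "design", "positive"),
    ("design", "design", "positive"),
    ("design", "design", "negative"),
]
shares = sentiment_distribution(records)
print(shares[("design", "design")])
```

With these made-up counts the term comes out 75% positive and 25% negative, the same shape as a row of Table 3.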