Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (4): 72-79     https://doi.org/10.11925/infotech.2096-3467.2020.1083
Research Paper
Sentiment Analysis with Reviewer Types and Generative Adversarial Network *
Li Feifei1, Wu Fan2, Wang Zhongqing2 (corresponding author)
1 Library of Soochow University, Suzhou 215006, China
2 School of Computer Science and Technology, Soochow University, Suzhou 215006, China
Abstract

[Objective] This paper studies how professional critics and ordinary viewers differ in the way they express sentiment in review text, with the aim of improving the accuracy of review sentiment classification. [Methods] The reviewer's professional type is used as an auxiliary signal for judging the sentiment polarity of a review. A generative adversarial network is used to determine whether a review comes from a professional critic or an ordinary viewer, and the differences it captures in how the two groups express sentiment are used to further improve classification accuracy. [Results] The proposed model, GJOINT, reached an accuracy of 0.836, which is 5.6 and 4.4 percentage points higher than the LSTM and BiLSTM baselines, respectively. [Limitations] The experiments used only a movie review data set; the model's effectiveness on data sets from other domains needs further verification. [Conclusions] The proposed GJOINT model, based on a generative adversarial network and reviewer professional type, can effectively improve sentiment classification of online reviews.

Keywords: Sentiment Analysis    Generative Adversarial Network    Review Professionalism    LSTM
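Received: 2020-11-02      Published: 2021-05-17
Chinese Library Classification (ZTFLH): TP391
Funding: *This work is one of the research outcomes of National Natural Science Foundation of China projects 61806137 and 61702351.
Corresponding author: Wang Zhongqing     E-mail: wangzq@suda.edu.cn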
Cite this article:
Li Feifei, Wu Fan, Wang Zhongqing. Sentiment Analysis with Reviewer Types and Generative Adversarial Network *[J]. Data Analysis and Knowledge Discovery, 2021, 5(4): 72-79.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2020.1083      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2021/V5/I4/72
No.   Review example
[E1]  Funny and charming, it’s a very good movie.
[E2]  This version of Black Panther weaves a seemingly disparate set of strands together rather seamlessly, presenting more than enough comic book acrobatics and mayhem to satisfy the core fan base while also speaking a necessary degree of truth to power.
Table 1  Review examples
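Fig.1  Joint model for sentiment classification and reviewer professional type classification
Fig.2  Sentiment classification model based on a generative adversarial network and reviewer professional type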
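The captions above describe a model that handles sentiment classification and reviewer-type classification together. This page does not give the architecture or the training objective, so the following is only a minimal sketch of one common way to combine two classification tasks: a weighted sum of their cross-entropy losses. The function names, the example probabilities, and the `type_weight` hyperparameter are all hypothetical, and the adversarial component of GJOINT is not modeled here.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class under the predicted probabilities."""
    return -math.log(probs[label])


def joint_loss(sentiment_probs, sentiment_label, type_probs, type_label, type_weight=0.5):
    """Sentiment loss plus a weighted reviewer-type loss.

    type_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    main = cross_entropy(sentiment_probs, sentiment_label)
    auxiliary = cross_entropy(type_probs, type_label)
    return main + type_weight * auxiliary


# Hypothetical predictions for one review.
# Sentiment classes: (negative, positive); type classes: (ordinary viewer, critic).
loss = joint_loss([0.2, 0.8], 1, [0.3, 0.7], 1)
print(round(loss, 4))
```

Setting `type_weight` to 0 reduces the objective to plain sentiment classification, which is one way to see what the auxiliary reviewer-type task adds in such a setup.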
Model            Experiment 1 accuracy    Experiment 2 accuracy
                 A        C               A        C
SingleCategory   0.678    0.734           0.768    0.802
CrossCategory    0.648    0.623           0.746    0.706
MultiCategory    0.712    0.685           0.782    0.816
Table 2  Sentiment classification results for reviews of different professional types
Model     Experiment 1 accuracy    Experiment 2 accuracy
LSTM      0.687                    0.780
BiLSTM    0.672                    0.792
JOINT     0.703                    0.806
GJOINT    0.716                    0.836
Table 3  Comparison of sentiment classification performance
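The gains quoted in the abstract (5.6 and 4.4) correspond to absolute differences in the Experiment 2 column of Table 3, that is, percentage points rather than relative percentages. The short check below recomputes both kinds of gain from the table values; the helper names are illustrative only.

```python
# Experiment 2 accuracies copied from Table 3.
accuracy = {"LSTM": 0.780, "BiLSTM": 0.792, "JOINT": 0.806, "GJOINT": 0.836}


def gain_in_points(model, baseline):
    """Absolute accuracy gain of `model` over `baseline`, in percentage points."""
    return round((accuracy[model] - accuracy[baseline]) * 100, 1)


def relative_gain(model, baseline):
    """Relative accuracy gain of `model` over `baseline`, in percent."""
    return round((accuracy[model] / accuracy[baseline] - 1) * 100, 1)


for baseline in ("LSTM", "BiLSTM", "JOINT"):
    points = gain_in_points("GJOINT", baseline)
    relative = relative_gain("GJOINT", baseline)
    print(f"GJOINT vs {baseline}: +{points} points, +{relative}% relative")
```

The absolute gains over LSTM and BiLSTM come out to 5.6 and 4.4 points, matching the abstract, while the relative gains are larger because they are measured against baseline accuracies below 1.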
Review text, followed by the LSTM and GJOINT predictions
[E3] The most courageous thing about it, from today’s standards, is that it closes without an obligatory happy ending, and an audience that has watched for 187 minutes doesn’t get a tidy, mindless conclusion.
     LSTM: Negative    GJOINT: Positive
[E4] boring old hollywood style epic with that canned elevator music. still it was the best of its kind.
     LSTM: Negative    GJOINT: Positive
Table 4  Examples used to evaluate the effectiveness of the GJOINT model