Data Analysis and Knowledge Discovery  2024, Vol. 8 Issue (3): 53-62    DOI: 10.11925/infotech.2096-3467.2023.0080
Sentiment Analysis of Online Health Community Based on Emotional Enhancement and Knowledge Fusion
Zhang Wei1, Xu Zonghuang1, Cai Hongyu1, Han Pu2,3, Shi Jin1
1School of Information Management, Nanjing University, Nanjing 210023, China
2School of Management, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
3Jiangsu Provincial Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
Abstract  

[Objective] This study conducts sentiment analysis of online health community texts by exploiting the emotional knowledge carried in their syntactic structures. We propose WoBEK-GAT, a sentiment analysis model for online health communities based on emotional enhancement and knowledge fusion. [Methods] First, we used WoBERT Plus to generate dynamic word embeddings. Then, we extracted semantic features with CNN and BiLSTM. Finally, we integrated key syntactic information from pruned dependency trees with external emotional knowledge through emotional enhancement and knowledge fusion strategies, and fed the result into a GAT to output sentiment categories. [Results] In comparative experiments on a self-constructed Chinese dataset, the proposed model reached a MacroF1 of 88.48%, which is 15.49, 14.15, and 13.15 percentage points higher than the baseline models CNN, BiLSTM, and GAT, respectively. [Limitations] The study does not consider sentiment knowledge in multimodal information such as images and speech. [Conclusions] The proposed model effectively improves sentiment analysis of online health community texts.

Key words: Online Health Community; Sentiment Analysis; Emotional Enhancement; Knowledge Fusion; GAT
Received: 09 February 2023      Published: 12 September 2023
CLC Number (ZTFLH): G350
Fund: National Social Science Fund of China (21BTQ012); National Social Science Fund of China (22BTQ096)
Corresponding Author: Han Pu, ORCID: 0000-0001-5867-4292, E-mail: hanpu@njupt.edu.cn

Cite this article:

Zhang Wei, Xu Zonghuang, Cai Hongyu, Han Pu, Shi Jin. Sentiment Analysis of Online Health Community Based on Emotional Enhancement and Knowledge Fusion. Data Analysis and Knowledge Discovery, 2024, 8(3): 53-62.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2023.0080     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2024/V8/I3/53

WoBEK-GAT Model
Construction Process of EK-GAT
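Dataset                          Category   Count
Online health community corpus   Positive   23,315
Online health community corpus   Neutral    23,237
Online health community corpus   Negative   23,172
Online health community corpus   Total      69,724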
Online Health Community Data Labeling Statistics
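Parameter                           Description                          Value
Max Length of Sentences             Maximum text sequence length         150
Size of Word Vector                 Word-vector dimension                768
Batch Size                          Number of samples per batch          16
Numbers of Feature Map              Number of CNN convolution kernels    200
Hidden Size of BiLSTM               BiLSTM hidden-layer size             256
Numbers of Multi-Head Attention     Number of attention heads            4
Negative Input Slope of LeakyReLU   LeakyReLU negative-input slope       0.2
Best Degree of Knowledge Fusion     Optimal degree of knowledge fusion   0.3
Epochs                              Number of training epochs            10
Different Learning Rate             Differential learning rates          0.001 (others), 2e-5 (WoBERT Plus)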
Parameter Setting
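The LeakyReLU slope and the attention-head count above are the settings of a graph attention layer. As a rough illustration only, the following sketch computes attention weights in the standard GAT formulation (score = LeakyReLU(a^T [h_i || h_j]), softmax over each node's neighbours) in plain Python. The function names, the toy features, and the toy graph are assumptions made for this example; the paper's EK-GAT variant may differ in detail.

```python
import math

LEAKY_SLOPE = 0.2  # negative-input slope, taken from the parameter table


def leaky_relu(x, slope=LEAKY_SLOPE):
    """LeakyReLU: identity for x >= 0, slope * x otherwise."""
    return x if x >= 0 else slope * x


def attention_weights(h, neighbors, a_src, a_dst):
    """Softmax-normalised attention of each node over its neighbours.

    h           : list of node feature vectors (assumed already projected)
    neighbors   : dict mapping node index -> neighbour indices (incl. self)
    a_src/a_dst : the two halves of the attention vector a, so that
                  a^T [h_i || h_j] = a_src . h_i + a_dst . h_j
    """
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    weights = {}
    for i, nbrs in neighbors.items():
        scores = [leaky_relu(dot(a_src, h[i]) + dot(a_dst, h[j])) for j in nbrs]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights[i] = {j: e / total for j, e in zip(nbrs, exps)}
    return weights


# Toy graph: three words, with edges from a hypothetical dependency parse.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
alpha = attention_weights(h, neighbors, a_src=[0.5, -0.5], a_dst=[1.0, -1.0])
```

With four attention heads, as in the table, a layer would run four such computations with separate parameters and combine the per-head outputs, typically by concatenation or averaging.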
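Category                      Model    MacroP/%   MacroR/%   MacroF1/%
Traditional neural networks   CNN      73.15      72.83      72.99
Traditional neural networks   RNN      72.79      72.28      72.53
Traditional neural networks   LSTM     74.41      73.03      73.71
Traditional neural networks   BiLSTM   74.15      74.51      74.33
Graph neural networks         GCN      74.26      74.95      74.60
Graph neural networks         GCN-P    74.83      75.00      74.91
Graph neural networks         GAT      75.38      75.29      75.33
Graph neural networks         GAT-P    76.04      75.80      75.92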
Experimental Results of Six Benchmark Models
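The three metrics reported in these tables can be computed from a confusion matrix. The sketch below is one way to do so, assuming MacroF1 is the harmonic mean of MacroP and MacroR. The page does not state which MacroF1 definition the authors used (the alternative averages per-class F1 scores), although the reported rows are consistent with the harmonic mean to two decimals. The function names and the toy confusion matrix are illustrative.

```python
def harmonic_f1(p, r):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def macro_scores(confusion):
    """Macro-averaged precision, recall and F1 from a square confusion matrix.

    confusion[t][p] = number of samples with true class t predicted as p.
    """
    n = len(confusion)
    precisions, recalls = [], []
    for c in range(n):
        tp = confusion[c][c]
        predicted = sum(confusion[t][c] for t in range(n))  # column sum
        actual = sum(confusion[c])                          # row sum
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    macro_p = sum(precisions) / n
    macro_r = sum(recalls) / n
    return macro_p, macro_r, harmonic_f1(macro_p, macro_r)


# Toy three-class matrix (positive / neutral / negative), invented for the demo.
toy = [[8, 1, 1],
       [2, 7, 1],
       [0, 2, 8]]
macro_p, macro_r, macro_f1 = macro_scores(toy)

# Consistency check against the CNN row of the table above.
cnn_f1 = round(harmonic_f1(73.15, 72.83), 2)
```

Because MacroP and MacroR are close in every row, the harmonic and arithmetic means nearly coincide, so the check does not by itself identify the authors' definition.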
Model            MacroP/%   MacroR/%   MacroF1/%
W2V-CNN          75.12      75.88      75.50
W2V-BiLSTM       76.27      77.07      76.67
W2V-CNN-BiLSTM   77.59      78.01      77.80
WoB-CNN          79.81      80.24      80.02
WoB-BiLSTM       82.04      82.71      82.37
WoB-CNN-BiLSTM   82.93      83.29      83.11
Experimental Results of Different Feature Extraction Methods
Model                  MacroP/%   MacroR/%   MacroF1/%
WoB-CNN-BiLSTM-GAT-P   84.68      84.92      84.80
WoB-CNN-BiLSTM-EGAT1   85.83      86.47      86.15
WoB-CNN-BiLSTM-EGAT2   84.87      85.39      85.13
WoB-CNN-BiLSTM-EGAT3   86.76      87.25      87.00
Experimental Results of Emotional Enhancement
Degree of knowledge fusion   MacroP/%   MacroR/%   MacroF1/%
K=0                          84.68      84.92      84.80
K=0.1                        84.91      85.13      85.02
K=0.2                        85.20      85.61      85.40
K=0.3                        85.49      85.93      85.71
K=0.4                        85.37      85.73      85.55
K=0.5                        85.35      85.51      85.43
K=0.6                        84.22      84.69      84.45
K=0.7                        83.83      84.07      83.95
K=0.8                        82.21      82.62      82.41
K=0.9                        78.24      78.95      78.59
K=1.0                        67.75      69.01      68.37
Experimental Results of Knowledge Fusion
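This page does not give the fusion formula. One reading consistent with the table is a convex combination of syntactic edge weights and external sentiment-knowledge weights: the K=0 row equals the WoB-CNN-BiLSTM-GAT-P result (no external knowledge), and K=1.0 gives the lowest score. The sketch below illustrates that reading only. The formula, the `fuse` function, and both toy matrices are assumptions, not the authors' implementation.

```python
def fuse(syntax_adj, knowledge_adj, k):
    """Blend two edge-weight matrices as (1 - k) * syntax + k * knowledge.

    Assumed formulation: k = 0 keeps only the syntactic graph,
    k = 1 keeps only the external-knowledge weights.
    """
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")
    return [
        [(1.0 - k) * s + k * e for s, e in zip(s_row, e_row)]
        for s_row, e_row in zip(syntax_adj, knowledge_adj)
    ]


# Toy 3-word sentence: dependency edges (with self-loops) and
# hypothetical sentiment-lexicon affinities between word pairs.
syntax_adj = [[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]]
knowledge_adj = [[0.0, 0.8, 0.0],
                 [0.8, 0.0, 0.2],
                 [0.0, 0.2, 0.0]]

fused = fuse(syntax_adj, knowledge_adj, k=0.3)  # 0.3 is the best K reported
```

Under this reading, a moderate K lets sentiment knowledge adjust the edge weights while the dependency structure still dominates, which would match the peak at K=0.3 and the decline beyond K=0.5.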
Number of layers   MacroP/%   MacroR/%   MacroF1/%
N=1                87.15      87.64      87.39
N=2                88.25      88.72      88.48
N=3                73.63      74.35      73.99
N=4                53.14      54.01      53.57
N=5                32.75      34.12      33.42
Experimental Results of WoBEK-GAT Model with Different Number of Layers
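The margins quoted in the abstract can be re-derived from the tables. The short check below copies the MacroF1 values from this page, picks the best layer count, and computes the gap to each baseline in percentage points. The variable names are illustrative.

```python
# MacroF1 (%) by number of GAT layers, from the table above.
f1_by_layers = {1: 87.39, 2: 88.48, 3: 73.99, 4: 53.57, 5: 33.42}

# Baseline MacroF1 (%) from the benchmark-model table.
baselines = {"CNN": 72.99, "BiLSTM": 74.33, "GAT": 75.33}

best_n = max(f1_by_layers, key=f1_by_layers.get)  # layer count with highest F1
best_f1 = f1_by_layers[best_n]

# Absolute differences in percentage points, rounded to two decimals.
margins = {name: round(best_f1 - f1, 2) for name, f1 in baselines.items()}

for name, gap in margins.items():
    print(f"{name}: +{gap:.2f} percentage points")
```

The differences are absolute gaps between percentages, so they are percentage points rather than relative improvements.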