Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (10): 95-108    DOI: 10.11925/infotech.2096-3467.2022.0891
Examining Topics and Sentiments of Chronic Disease Patients’ Online Reviews — Case Study of “Sweet Homeland”
Yu Jiaqi1,2, Zhao Doudou1, Liu Rui1
1School of Information Management, Central China Normal University, Wuhan 430079, China
2Hubei Normal University Library, Huangshi 435002, China
Abstract  

[Objective] This paper constructs a model for collaborative topic-sentiment mining to better understand chronic disease patients at different stages of illness. [Methods] First, we added sentiment and time features to the LDA model to create the dUTSU (dynamic unsupervised topic and sentiment unification) model. Then, we retrieved posts by diabetes patients from an online health community. Finally, we assessed the dUTSU model’s performance through topic-sentiment analysis and topic-sentiment evolution analysis. [Results] The dUTSU model outperformed the JST, ASUM, and UTSU models in perplexity, average topic similarity, and sentiment classification accuracy. The model identified 15 topics and captured trending topics, sentiment, and sentiment intensity across seven distinct periods, including the disease diagnosis stage and the complication stage. The model also revealed how topics and sentiment evolved over time. [Limitations] The experiment used only reviews from diabetes patients. We did not consider patients’ geographical locations, personal attributes, or social relationships. [Conclusions] The dUTSU model can effectively and collaboratively extract topic-sentiment information from the reviews of patients with chronic diseases. The findings can serve as valuable references for online health communities, medical institutions, and patients in carrying out health services.
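
The abstract compares models on perplexity and average topic similarity without defining them on this page. The sketch below shows the textbook forms of both metrics for a plain LDA-style model. It is not the authors' implementation: it ignores the sentiment and time dimensions of dUTSU, and the function names, the `theta`/`phi` layout, and the use of cosine similarity are assumptions made for illustration.

```python
import math

def perplexity(docs, theta, phi):
    """Textbook perplexity: exp(-log-likelihood / token count); lower is better.

    docs  : list of documents, each a list of word ids (hypothetical layout)
    theta : theta[d][k] = P(topic k | document d)
    phi   : phi[k][w]   = P(word w | topic k)
    """
    log_likelihood = 0.0
    token_count = 0
    for d, words in enumerate(docs):
        for w in words:
            # Marginalize over topics to get P(word | document).
            p_w = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            log_likelihood += math.log(p_w)
            token_count += 1
    return math.exp(-log_likelihood / token_count)

def average_topic_similarity(phi):
    """Mean pairwise cosine similarity between topic-word distributions.

    Assumes at least two topics. A lower value suggests more distinct topics.
    """
    sims = []
    for i in range(len(phi)):
        for j in range(i + 1, len(phi)):
            dot = sum(a * b for a, b in zip(phi[i], phi[j]))
            norm_i = math.sqrt(sum(a * a for a in phi[i]))
            norm_j = math.sqrt(sum(b * b for b in phi[j]))
            sims.append(dot / (norm_i * norm_j))
    return sum(sims) / len(sims)
```

In practice both metrics are typically computed over a range of topic counts, and a count is chosen where perplexity and similarity are jointly low.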

Key words: Online Health Community; Joint Topic-Sentiment Model; Evolution Analysis; Chronic Disease
Received: 24 August 2022      Published: 28 March 2023
ZTFLH: G350; R197; TP391
Fund: National Social Science Fund of China (22&ZD324)
Corresponding Author: Liu Rui, ORCID: 0000-0002-5450-4947, E-mail: liuruiccnu@hotmail.com

Cite this article:

Yu Jiaqi, Zhao Doudou, Liu Rui. Examining Topics and Sentiments of Chronic Disease Patients’ Online Reviews — Case Study of “Sweet Homeland”. Data Analysis and Knowledge Discovery, 2023, 7(10): 95-108.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.0891     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2023/V7/I10/95

Dynamic Unsupervised Topic and Sentiment Unification Model
Perplexity Index Results Under Different Topics
Classification Accuracy
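Topic label | Selected topic words | Topic category
1 | diabetes, trigger, etiology, problem, constitution, cause, emotion, habit, staying up late, eating | Etiology consultation
2 | diabetes, blood glucose, patient, knowledge, glucose control, fasting, insulin, guide, essential, sharing | Knowledge sharing
3 | blood glucose, control, exercise, glucose lowering, health, running, persistence, weight loss, keep going, workout | Blood glucose control
4 | harm, body, complication, impact, vision, lesion, health, hypoglycemia, fluctuation, kidney disease | Disease harms
5 | surgery, treatment, need, resection, gastric bypass, healing, weight reduction, patient, sequelae, recovery | Surgery consultation
6 | taking medicine, insulin, dose reduction, metformin, dosage, side effect, stable, price, difference, damage | Medication use
7 | examination, indicator, glycation, result, secretion, report, glucose tolerance, abnormal, biochemistry, hemoglobin | Disease examination
8 | hospital, diabetes, seeing a doctor, doctor, specialist, treatment, appointment, medical insurance, examination, reimbursement | Medical care guidance
9 | treatment, doctor, medication, plan, drug, patient, clinical, important, research, effective | Treatment plans and research
10 | diabetes, early stage, how long, complication, progression, diagnosis, time, condition, eventually, delay | Disease progression stages
11 | diabetes, concurrent, prevention, hypertension, examination, detection, in advance, hypoglycemia, relief, early stage | Disease prevention
12 | breakfast, eating meals, staple food, fasting, rice, blood glucose, fat, dinner, milk, snack | Diet
13 | neuropathy, complication, symptom, vision, typical, blurred, diabetic foot, pain, lead to, cause | Disease symptom consultation
14 | glucose meter, test strip, steamed bun, Johnson & Johnson, comparison, blood glucose value, standard, index, on target, not bad | Glucose meter products
15 | diabetes, congenital, heredity, child, gene, family, onset, relative, pregnancy, probability | Disease heredity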
Topic Generation Results
Topic-Sentiment Probability Distribution
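Positive sentiment: Topics 3, 9, 14. Negative sentiment: Topics 6, 7, 13.
Topic 3 | Topic 9 | Topic 14 | Topic 6 | Topic 7 | Topic 13
blood glucose | treatment | glucose meter | taking medicine | examination | neuropathy
control | doctor | test strip | insulin | indicator | complication
exercise | medication | steamed bun | dose reduction | glycation | symptom
glucose lowering | plan | Johnson & Johnson | metformin | result | vision
health | drug | comparison | dosage | secretion | typical
running | patient | blood glucose value | side effect | report | blurred
persistence | clinical | standard | stable | glucose tolerance | diabetic foot
weight loss | important | index | price | abnormal | pain
keep going | research | on target | difference | biochemistry | lead to
workout | effective | not bad | damage | hemoglobin | cause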
<Topic,Sentiment> Pair of Words
Evolution of Positive Sentiment Intensity
High-Frequency Keywords Changes of Topic 3 over Time
Evolution of Negative Sentiment Intensity
High-Frequency Keywords Changes of Topic 13 over Time
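Sentiment | Topic | Time slice 1 | 2 | 3 | 4 | 5 | 6 | 7
(+) | Blood glucose control | 1.84 | 2.27 | 2.04 | 1.71 | 1.65 | 1.57 | 1.72
(+) | Treatment plans and research | 1.14 | 1.39 | 1.46 | 1.30 | 1.51 | 1.64 | 2.06
(+) | Glucose meter products | 1.39 | 1.06 | 1.16 | 1.22 | 1.07 | 1.00 | 1.11
(-) | Medication use | 1.47 | 1.60 | 1.32 | 1.39 | 1.32 | 1.59 | 1.71
(-) | Disease examination | 1.62 | 1.12 | 1.00 | 1.59 | 1.62 | 1.03 | 1.00
(-) | Symptom consultation | 1.81 | 1.26 | 1.40 | 1.78 | 2.07 | 1.93 | 1.76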
Sentiment Intensity Normalization
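
To make the evolution in the normalized sentiment intensity table easier to read, the following sketch loads the values and reports, for each topic, the time slice with the highest intensity and the change from the first to the last slice. The topic names are English renderings of the table's Chinese labels, and the helper names `peak_slice` and `net_change` are hypothetical. The normalization procedure itself is not described on this page, so the sketch only reads the published numbers.

```python
# Normalized sentiment intensity per topic across time slices 1..7,
# copied from the table above. Keys are (polarity, topic name).
INTENSITY = {
    ("+", "Blood glucose control"):        [1.84, 2.27, 2.04, 1.71, 1.65, 1.57, 1.72],
    ("+", "Treatment plans and research"): [1.14, 1.39, 1.46, 1.30, 1.51, 1.64, 2.06],
    ("+", "Glucose meter products"):       [1.39, 1.06, 1.16, 1.22, 1.07, 1.00, 1.11],
    ("-", "Medication use"):               [1.47, 1.60, 1.32, 1.39, 1.32, 1.59, 1.71],
    ("-", "Disease examination"):          [1.62, 1.12, 1.00, 1.59, 1.62, 1.03, 1.00],
    ("-", "Symptom consultation"):         [1.81, 1.26, 1.40, 1.78, 2.07, 1.93, 1.76],
}

def peak_slice(values):
    """Return the 1-based time slice with the highest intensity (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__) + 1

def net_change(values):
    """Return the intensity change from the first to the last slice, rounded to 2 decimals."""
    return round(values[-1] - values[0], 2)

if __name__ == "__main__":
    for (polarity, topic), values in INTENSITY.items():
        print(f"({polarity}) {topic}: peak at slice {peak_slice(values)}, "
              f"net change {net_change(values):+.2f}")
```

Read this way, positive sentiment about blood glucose control peaks in slice 2, while positive sentiment about treatment plans and research rises to its maximum in slice 7, and negative sentiment in symptom consultation peaks in slice 5.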