[Objective] This paper constructs a model for the collaborative mining of topics and sentiments, aiming to better understand chronic disease patients at different stages of illness. [Methods] First, we added sentiment and time features to the LDA model to create the dUTSU (dynamic unsupervised topic and sentiment unification) model. Then, we retrieved posts by diabetes patients from an online health community. Finally, we assessed the dUTSU model’s performance through topic-sentiment analysis and topic-sentiment evolution analysis. [Results] The dUTSU model outperformed the JST, ASUM, and UTSU models on perplexity, average topic similarity, and sentiment classification accuracy. The model identified 15 topics and captured the trending topics, sentiment, and sentiment intensity in seven distinct periods, including the disease diagnosis stage and the complication stage. The model also revealed how topics and sentiments evolved over time. [Limitations] The experiment used only reviews from diabetes patients. We did not consider patients’ geographical locations, personal attributes, or social relationships. [Conclusions] The dUTSU model can effectively and collaboratively extract topic and sentiment information from the reviews of patients with chronic diseases. The findings can serve as a valuable reference for online health communities, medical institutions, and patients in supporting health services.
余佳琪, 赵豆豆, 刘蕤. 在线健康社区慢性病患者评论主题情感协同挖掘研究——以甜蜜家园为例[J]. 数据分析与知识发现, 2023, 7(10): 95-108.
Yu Jiaqi, Zhao Doudou, Liu Rui. Examining Topics and Sentiments of Chronic Disease Patients’ Online Reviews — Case Study of “Sweet Homeland”. Data Analysis and Knowledge Discovery, 2023, 7(10): 95-108.