Clustering and Characterizing Depression Patients Based on Online Medical Records
Nie Hui, Wu Xiaoyan, Lin Yun
School of Information Management, Sun Yat-Sen University, Guangzhou 510006, China
Abstract [Objective] This study analyzes the online consultation records of depression patients to gain a detailed understanding of their conditions and needs. [Methods] First, we retrieved depression consultation records from haodf.com, an online medical platform. Second, we represented each patient with word vectors and identified patient groups with the K-means clustering algorithm. Third, we used visualization techniques, including t-SNE, heat maps, and word clouds, to analyze the structure of the groups and the relationships among them. Finally, we compared the emotional-psychological, social, and behavioral characteristics of the groups and identified their treatment needs with the LDA topic model. [Results] We found six groups of depression patients that differ in emotional-psychological state, social relationships, and behavior. The patients' needs include advice on offline medical treatment, multi-faceted consultation, and inquiries about medication. [Limitations] The keywords used to compare group characteristics in each dimension were selected by part-of-speech filtering and manual analysis. [Conclusions] The proposed method helps characterize patients and their needs, which can inform the design of better online medical platforms.
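The grouping step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the toy two-dimensional embeddings, the plain averaging of word vectors into one vector per record, and the helper names `patient_vector` and `kmeans` are all assumptions made for illustration. The abstract states only that patients were modeled with word vectors and grouped with K-means.

```python
import random
from typing import Dict, List, Optional, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def patient_vector(tokens: Sequence[str], embeddings: Dict[str, Vector]) -> Vector:
    """Represent one record as the average of its known word vectors.

    Plain averaging is an assumption; the abstract does not say how
    word vectors were combined into a patient representation.
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        raise ValueError("record contains no token with a known embedding")
    return mean_vector(known)


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Vector], k: int, iterations: int = 50, seed: int = 0) -> List[int]:
    """Basic Lloyd-style K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(list(points), k)]
    labels: Optional[List[int]] = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels if labels is not None else []


# Hypothetical toy data: two symptom-related and two medication-related words.
embeddings = {
    "insomnia": [1.0, 0.0],
    "anxious": [0.9, 0.1],
    "medication": [0.0, 1.0],
    "dosage": [0.1, 0.9],
}
records = [
    ["insomnia", "anxious"],
    ["anxious", "insomnia", "anxious"],
    ["medication", "dosage"],
    ["dosage", "medication", "dosage"],
]
points = [patient_vector(r, embeddings) for r in records]
labels = kmeans(points, k=2)
print(labels)
```

On this toy input the two symptom-oriented records fall into one cluster and the two medication-oriented records into the other. Real consultation text would need word segmentation, trained embeddings, and a principled choice of the number of clusters, none of which this sketch covers.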
Received: 23 August 2021
Published: 07 January 2022
Fund: Guangzhou Science and Technology Plan Project (202002020036)
Corresponding Author: Nie Hui, ORCID: 0000-0001-8567-3084
E-mail: issnh@mail.sysu.edu.cn