Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (11): 52-59    DOI: 10.11925/infotech.2096-3467.2019.0294
Discovering Topics of Online Health Community with Q-LDA Model
Lei Yang, Zirun Wang, Guisheng Hou
College of Economics and Management, Shandong University of Science and Technology, Qingdao 266510, China
Abstract  

[Objective] This paper builds a Q-LDA model to identify topics in online health communities, aiming to improve the quality of the information extracted by the LDA model and its ability to represent topics. [Methods] First, we evaluated the quality of online health information and assigned weights accordingly. Then, we constructed the Q-LDA topic mining model on the basis of the LDA model. Finally, we tested the proposed model on real-world data. [Results] The Q-LDA model outperformed the traditional LDA model, improving the efficiency of topic extraction by 16%. [Limitations] The proposed model was tested only on textual data from online discussion boards about a single disease. [Conclusions] Incorporating the quality of health information into topic mining can help better meet users' needs.
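The abstract does not state how the quality weights enter the model. As a minimal sketch of one possible reading, the code below runs a collapsed Gibbs sampler for LDA in which each token of a post counts with that post's quality weight instead of 1. The function name `weighted_lda`, the weighting scheme, the hyperparameters, and the toy posts and weights are all illustrative assumptions, not the authors' Q-LDA definition.

```python
import random

def weighted_lda(docs, weights, n_topics=2, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA where every token of document d adds
    weights[d] (instead of 1) to the count tables.
    ASSUMPTION: this weighting scheme is hypothetical, not taken from the paper."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    wid = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    topic_word = [[0.0] * V for _ in range(n_topics)]  # weighted topic-word counts
    topic_sum = [0.0] * n_topics                       # weighted token mass per topic
    doc_topic = [[0.0] * n_topics for _ in docs]       # weighted doc-topic counts
    assign = []                                        # topic of each token

    def update(d, v, k, delta):
        topic_word[k][v] += delta
        topic_sum[k] += delta
        doc_topic[d][k] += delta

    # Random initial topic assignment.
    for d, doc in enumerate(docs):
        assign.append([])
        for w in doc:
            k = rng.randrange(n_topics)
            assign[d].append(k)
            update(d, wid[w], k, weights[d])

    # Gibbs sweeps: remove a token, resample its topic, add it back.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = wid[w]
                update(d, v, assign[d][i], -weights[d])
                p = [(doc_topic[d][k] + alpha) * (topic_word[k][v] + beta)
                     / (topic_sum[k] + V * beta) for k in range(n_topics)]
                k = rng.choices(range(n_topics), weights=p)[0]
                assign[d][i] = k
                update(d, v, k, weights[d])

    # Smoothed topic-word distributions.
    phi = [[(topic_word[k][v] + beta) / (topic_sum[k] + V * beta) for v in range(V)]
           for k in range(n_topics)]
    return vocab, phi

# Toy posts and quality weights, invented purely for illustration.
docs = [
    ["insulin", "injection", "metformin", "insulin"],
    ["exercise", "diet", "vegetables", "exercise"],
    ["insulin", "metformin", "injection"],
    ["diet", "exercise", "buy", "cheap", "pills"],  # imagined low-quality post
]
weights = [1.0, 1.0, 0.9, 0.2]  # hypothetical quality scores in (0, 1]
vocab, phi = weighted_lda(docs, weights)
for k, row in enumerate(phi):
    top = sorted(zip(row, vocab), reverse=True)[:3]
    print(k, [w for _, w in top])
```

With all weights set to 1.0 the sampler reduces to ordinary collapsed Gibbs LDA, so the same function can serve as an unweighted baseline for comparison.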

Keywords: Text Data, Online Health Community, Knowledge Discovery, Topic Mining
Received: 18 March 2019      Published: 18 December 2019
CLC Number: C81 G35
Corresponding Authors: Guisheng Hou     E-mail: houguisheng001@163.com

Cite this article:

Lei Yang, Zirun Wang, Guisheng Hou. Discovering Topics of Online Health Community with Q-LDA Model. Data Analysis and Knowledge Discovery, 2019, 3(11): 52-59.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2019.0294     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2019/V3/I11/52

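Topic No.   Q-LDA modeling results
Topic1      fasting blood glucose | reference value | blood glucose level | higher than | hypoglycemic drugs | urine test | fluctuation | range | lower than | pay attention
Topic4      hospital | glucose meter | diabetes | examination | endocrinology | consultation | intensive treatment | department | medical history | advice
Topic5      exercise | rest | workout | pay attention | diet | vegetables | binge eating | body weight | high-calorie | control
Topic27     complications | diabetic foot | vision | diabetic nephropathy | comorbid diabetes | typical symptoms | retinopathy | frequent urination | chronic complications | dark yellow urine
Topic45     insulin | resistance | injection | drug discontinuation | lifelong | insulin use | metformin | absolute | medication | switching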
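Topic No.   Model   Modeling results
Topic1      Q-LDA   blood glucose | reference value | blood glucose level | higher than | hypoglycemic drugs | urine test | fluctuation | range | lower than | pay attention
Topic1      LDA     blood glucose | glucose tolerance | test | elevated | examination | greater than or equal to | testing | diagnosis | serum | hungry
Topic4      Q-LDA   hospital | glucose meter | diabetes | examination | endocrinology | consultation | intensive treatment | department | medical history | advice
Topic4      LDA     hospital | plan | selection | signs | hereditary disease | internal medicine | symptoms | worry | surgery | impaired
Topic5      Q-LDA   exercise | rest | workout | pay attention | diet | vegetables | binge eating | body weight | high-calorie | control
Topic5      LDA     workout | exercise | light diet | take care | sleep | moderate amount | vegetables | body weight | control
Topic27     Q-LDA   complications | diabetic foot | vision | diabetic nephropathy | comorbid diabetes | typical symptoms | retinopathy | frequent urination | chronic complications | dark yellow urine
Topic27     LDA     complications | take care | decrease | peak | cervical spondylosis | excessive | pregnancy | glucose lowering | range | pregnancy preparation
Topic45     Q-LDA   insulin | resistance | injection | drug discontinuation | lifelong | insulin use | metformin | absolute | medication | switching
Topic45     LDA     insulin | health consultation | random | use | intensive treatment | infection | antibiotics | screening | habit | standard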
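One simple way to read the comparison above is to count how many top words the two models share for the same topic. The sketch below does this for Topic5 using English glosses of the listed words. The Jaccard measure and the helper name `jaccard` are illustrative choices made here and are not the evaluation metric reported in the paper.

```python
def jaccard(a, b):
    """Jaccard similarity of two word lists, treated as sets."""
    a, b = set(a), set(b)
    if not (a | b):
        return 0.0
    return len(a & b) / len(a | b)

# English glosses of the Topic5 top words from the comparison table.
q_lda_topic5 = ["exercise", "rest", "workout", "pay attention", "diet",
                "vegetables", "binge eating", "body weight", "high-calorie", "control"]
lda_topic5 = ["workout", "exercise", "light diet", "take care", "sleep",
              "moderate amount", "vegetables", "body weight", "control"]

shared = sorted(set(q_lda_topic5) & set(lda_topic5))
score = jaccard(q_lda_topic5, lda_topic5)
print(shared)           # words both models rank among the top terms
print(round(score, 3))  # 5 shared words out of 14 distinct words -> 0.357
```

A low overlap only shows that the two models surface different vocabulary for the same topic; judging which word list is more informative still requires domain knowledge.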